Abstract

We show how the complexity of higher-order functional programs can be analysed automatically
by applying program transformations to a defunctionalised version of them, and feeding the
result to existing tools for the complexity analysis of first-order term rewrite systems. This is
done while carefully analysing complexity preservation and reflection of the employed
transformations, such that the complexity of the obtained term rewrite system reflects on the
complexity of the initial program. Further, we describe suitable strategies for the application
of the studied transformations and provide ample experimental data for assessing the viability
of our method.


1 Introduction 

Automatically checking programs for correctness has attracted the attention of the computer
science research community since the birth of the discipline. Properties of interest are not
necessarily functional, however, and among the non-functional ones, notable cases are bounds on
the amount of resources (like time, memory and power) programs need when executed.

Deriving upper bounds on the resource consumption of programs is indeed of paramount importance
in many cases, but becomes undecidable as soon as the underlying programming language is
non-trivial. If the units of measurement become concrete and close to the physical ones, the
problem gets even more complicated, given the many transformation and optimisation layers
programs are subjected to before being executed. A typical example is that of WCET techniques
adopted in real-time systems [52], which need to deal not only with how many machine
instructions a program corresponds to, but also with how much time each instruction costs when
executed by possibly complex architectures (including caches, pipelining, etc.), a task which is
becoming even harder with the current trend towards multicore architectures.

As an alternative, one can analyse the abstract complexity of programs. As an example, one 
can take the number of instructions executed by the program or the number of evaluation steps to 
normal form, as a measure of its execution time. This is a less informative metric, which however 
becomes accurate if the actual time complexity of each instruction is kept low. One advantage of 
this analysis is the independence from the specific hardware platform executing the program at 
hand: the latter only needs to be analysed once. This is indeed a path which many have followed 
in the programming language community. A variety of verification techniques have been employed 
in this context, like abstract interpretations, model checking, type systems, program logics, or 
interactive theorem provers; see [3, 5, 34, 48] for some pointers. If we restrict our attention to 
higher-order functional programs, however, the literature becomes much sparser. 

*This work was partially supported by FWF project number J3563, FWF project number P25781-N15 and by
French ANR project Elica ANR-14-CE25-0005.
Figure 1: Complexity Analysis by HOCA and FOPs.


Conceptually, when analysing the time complexity of higher-order programs, there is a
fundamental trade-off to be dealt with. On the one hand, one would like to have, at least, a
clear relation between the cost attributed to a program and its actual complexity when executed:
only then would the results of the analysis be informative. On the other hand, many choices are
available as for how the complexity of higher-order programs can be evaluated, and one would
prefer one which is closer to the programmer's intuitions. Ideally, then, one would like to work
with an informative, even if not-too-concrete, cost measure, and to be able to evaluate programs
against it fully automatically.

In recent years, several advances have been made such that the objectives above now look more
realistic than in the past, at least as far as functional programming is concerned. First of all,
some positive, sometimes unexpected, results about the invariance of unitary cost models¹ have
been proved for various forms of rewrite systems, including the λ-calculus [1, 6, 18]. What these
results tell us is that counting the number of evaluation steps does not mean underestimating the
time complexity of programs, which is shown to be bounded by a polynomial (sometimes even by
a linear function [2]) in their unitary cost. This is good news, since the number of rewrite steps
is among the most intuitive notions of cost for functional programs, at least when time is the
resource one is interested in.

But there is more. The rewriting community has recently developed several tools for the
automated time complexity analysis of term rewrite systems, a formal model of computation that
is at the heart of functional programming. Examples are AProVE [25], CaT [53], and TcT [8]. These
first-order provers (FOPs for short) combine many different techniques and, after some years of
development, are starting to be able to treat non-trivial programs, as demonstrated by the
results of the annual termination competition.² This is potentially very interesting also for
the complexity analysis of higher-order functional programs, since well-known transformation
techniques such as defunctionalisation [46] are available, which turn higher-order functional
programs into equivalent first-order ones. This has been done in the realm of termination
[24, 42], but appears to be infeasible in the context of complexity analysis. Consequently, this
program transformation approach has been viewed critically in the literature, cf. [34].

A natural question, then, is whether time complexity analysis of higher-order programs can
indeed be performed by going through first-order tools. Is it possible to evaluate the unitary cost
of functional programs by translating them into first-order programs, analysing them by existing
first-order tools, and thus obtaining meaningful and informative results? Is, e.g., plain
defunctionalisation enough? In this paper, we show that the questions above can be answered
positively, when the necessary care is taken. We summarise the contributions of this paper.

1. We show how defunctionalisation is crucially employed in a transformation from higher-order
programs to first-order term rewrite systems, such that the time complexity of the latter reflects
upon the time complexity of the former. More precisely, we show a precise correspondence
between the number of reduction steps of the higher-order program and of its defunctionalised
version, represented as an applicative term rewrite system (see Proposition 2).

2. But defunctionalisation is not enough. Defunctionalised programs have a recursive structure
too complicated for FOPs to be effective on them. Our way to overcome this issue consists in
further applying appropriate program transformations. These transformations must of course
be proven correct to be viable. Moreover, we need the complexity analysis of the transformed
program to mean something for the starting program, i.e., we also prove the considered
transformations to be at least complexity reflecting, if not also complexity preserving. This
addresses the problem that program transformations may potentially alter the resource usage. We
establish inlining (see Corollary 1), instantiation (see Theorem 2), uncurrying (see Theorem 3),
and dead code elimination (see Proposition 4) as, at least, complexity reflecting program
transformations.

¹In the unitary cost model, a program is attributed a cost equal to the number of rewrite steps
needed to turn it to normal form.

²http://termination-portal.org/wiki/Termination_Competition.

3. Still, analysing abstract program transformations is not yet sufficient. The main technical
contribution of this paper concerns the automation of the program transformations rather than
the abstract study presented before. In particular, automating instantiation requires dealing
with the collecting semantics of the program at hand, a task we pursue by exploiting tree
automata and control-flow analysis. Moreover, we define program transformation strategies
which allow us to turn complicated defunctionalised programs into simpler ones that work well in
practice.

4. To evaluate our approach experimentally, we have built HOCA.³ This tool is able to translate
programs written in a pure, monomorphic subset of OCaml into a first-order rewrite system,
written in a format which can be understood by major first-order tools.

The overall flow of information is depicted in Figure 1. Note that by construction, the obtained
certificate reflects onto the runtime complexity of the initial OCaml program, taking into account
the standard semantics of OCaml. The figure also illustrates the modularity of the approach,
as the subset of OCaml studied here just serves as a simple example language to illustrate the
method: related languages can be analysed with the same set of tools, as long as the necessary
transformations can be proven sound and complexity reflecting.

Our testbed includes standard higher-order functions like foldl and map, but also more involved
examples such as an implementation of merge-sort using a higher-order divide-and-conquer
combinator as well as simple parsers relying on the monadic parser-combinator outlined in
Okasaki's functional pearl [41]. We emphasise that the methods proposed here are applicable in
the context of non-linear runtime complexities. The obtained experimental results are quite
encouraging.

The remainder of this paper is structured as follows. In the next section, we present our
approach abstractly on a motivating example and clarify the challenges of our approach. In
Section 3 we then present defunctionalisation formally. Section 4 presents the transformation
pipeline, consisting of the above mentioned program transformations. Implementation issues and
experimental evidence are given in Sections 5 and 6, respectively. Finally, we conclude in
Section 7 by discussing related work.

2 On Defunctionalisation: Ruling the Chaos 

The main idea behind defunctionalisation is conceptually simple: function-abstractions are
represented as first-order values; calls to abstractions are replaced by calls to a globally
defined apply-function. Consider for instance the following OCaml program:

    let comp f g = fun z -> f (g z) ;;

    let rec walk xs =
      match xs with
        [] -> (fun z -> z)
      | x :: ys -> comp (walk ys)
                     (fun z -> x :: z) ;;

    let rev l = walk l [] ;;

    let main l = rev l ;;

Run on a list of n elements, walk first constructs a function which reverses its first argument and
appends it to the second argument. This function, which can be easily defined by recursion, is fed
in rev with the empty list. The function main only serves the purpose of indicating which
function's complexity we are interested in.

³Our tool HOCA is open source and available under http://cbr.uibk.ac.at/tools/hoca/.

Defunctionalisation can be understood already at this level. We first define a datatype for 
representing the three abstractions occurring in the program: 

    type 'a cl =
        Cl1 of 'a cl * 'a cl   (* fun z -> f (g z) *)
      | Cl2                    (* fun z -> z *)
      | Cl3 of 'a              (* fun z -> x :: z *)

More precisely, an expression of type 'a cl represents a function closure, whose arguments are
used to store assignments to free variables. An infix operator (@), modelling application, can then
be defined as follows:⁴

    let rec (@) cl z =
      match cl with
        Cl1(f,g) -> f @ (g @ z)
      | Cl2 -> z
      | Cl3(x) -> x :: z ;;

Using this function, we arrive at a first-order version of the original higher-order function: 

    let comp f g = Cl1(f,g) ;;

    let rec walk xs =
      match xs with
        [] -> Cl2
      | x :: ys -> comp (walk ys) (Cl3(x)) ;;

    let rev l = walk l @ [] ;;

    let main l = rev l ;;

Observe that now the recursive function walk constructs an explicit representation of the closure
computed by its original definition. The function (@) carries out the remaining evaluation. This
program can now already be understood as a first-order rewrite system.
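
As a quick sanity check of the construction, the defunctionalised program can be written out as a small, self-contained sketch. The named function `apply` used here in place of the infix application operator is an assumption made for this illustration, as is the restriction to list-valued arguments; everything else follows the definitions above.

```ocaml
(* First-order representation of the three abstractions in the program. *)
type 'a cl =
  | Cl1 of 'a cl * 'a cl   (* fun z -> f (g z) *)
  | Cl2                    (* fun z -> z *)
  | Cl3 of 'a              (* fun z -> x :: z *)

(* Hypothetical named counterpart of the application operator:
   it interprets a closure representation on an argument list. *)
let rec apply cl z =
  match cl with
  | Cl1 (f, g) -> apply f (apply g z)
  | Cl2 -> z
  | Cl3 x -> x :: z

let comp f g = Cl1 (f, g)

(* walk no longer returns a function, only its first-order description. *)
let rec walk xs =
  match xs with
  | [] -> Cl2
  | x :: ys -> comp (walk ys) (Cl3 x)

let rev l = apply (walk l) []

let main l = rev l
```

Calling `main [1; 2; 3]` first builds the nested closure description and then lets `apply` consume it, which mirrors the two phases described in the text.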

Of course, a systematic construction of the defunctionalised program requires some care. For
instance, one has to deal with closures that originate from partial function applications. Still, the
construction is quite easy to mechanise, see Section 3 for a formal treatment. On our running
example, this program transformation results in the rewrite system $\mathcal{A}_{\mathsf{rev}}$, which looks as follows:⁵

     1  Cl1(f, g) @ z        →  f @ (g @ z)
     2  Cl2 @ z              →  z
     3  Cl3(x) @ z           →  x :: z
     4  comp1(f) @ g         →  Cl1(f, g)
     5  comp @ f             →  comp1(f)
     6  match_walk([])       →  Cl2
     7  match_walk(x :: ys)  →  comp @ (fix_walk @ ys) @ Cl3(x)
     8  walk @ xs            →  match_walk(xs)
     9  fix_walk @ xs        →  walk @ xs
    10  rev @ l              →  fix_walk @ l @ []
    11  main(l)              →  rev @ l

⁴The definition is rejected by the OCaml type-checker, which, however, is not an issue in our context.

⁵In $\mathcal{A}_{\mathsf{rev}}$, rule (9) reflects that, under the hood, we treat recursive let expressions as syntactic sugar for a
dedicated fixpoint operator.



Despite their conceptual simplicity, applicative rewrite systems such as the one above cannot be
analysed effectively by current FOPs. The reason lies in the way FOPs work, which itself reflects
the state of the art on formal methods for complexity analysis of first-order rewrite systems. In
order to achieve composability of the analysis, the given system is typically split into smaller
parts (see for example [9]), and each of them is analysed separately. Furthermore,
contextualisation (aka path analysis [30]) and a suitable form of flow graph analysis (or
dependency pair analysis [29, 40]) are performed. However, at the end of the day, basic syntactic
and semantic techniques, like path orders or interpretations [50, Chapter 6], are employed. All
these methods focus on the analysis of the given defined symbols (like for instance the
application symbol in the example above) and fail if their recursive definition is too
complicated. Naturally this calls for a special treatment of the applicative structure of the
system [31].

How could we get rid of those (@), thus highlighting the deep recursive structure of the program
above? Let us, for example, focus on the rewriting rule

    Cl1(f, g) @ z  →  f @ (g @ z) ,

which is particularly nasty for FOPs, given that the variables f and g will be substituted by
unknown functions, which could potentially have a very high complexity. How could we simplify
all this? The key observation is that although this rule tells us how to compose two arbitrary
closures, only very few instances of the rule above are needed, namely those where g is of the
form Cl3(x), and f is either Cl2 or again of the form Cl1(f', g'). This crucial information can
be retrieved from the so-called collecting semantics [39] of the term rewrite system above, which
precisely tells us which objects will possibly be substituted for rule variables along the
evaluation of certain families of terms. Dealing with all this fully automatically is of course
impossible, but techniques based on tree automata, and inspired by those in [32], can indeed be
of help.

Another useful observation is the following: function symbols like, e.g., comp or match_walk are
essentially useless. Their only purpose is to build intermediate closures, or to control program
flow. One could simply short-circuit them, using a form of inlining. After this is done, some of
the remaining rules are dead code, and can thus be eliminated from the program. At the end of the
day, we arrive at a truly first-order system, and uncurrying brings it to a format most suitable
for FOPs.

If we carefully apply the just described ideas to the example above, we end up with the following
first-order system, called $\mathcal{R}_{\mathsf{rev}}$, which is precisely what HOCA produces as output:

    1  Cl1^1(Cl2, Cl3(x), z)        →  x :: z
    2  Cl1^1(Cl1(f, g), Cl3(x), z)  →  Cl1^1(f, g, x :: z)
    3  fix_walk^1([])               →  Cl2
    4  fix_walk^1(x :: ys)          →  Cl1(fix_walk^1(ys), Cl3(x))
    5  main([])                     →  []
    6  main(x :: ys)                →  Cl1^1(fix_walk^1(ys), Cl3(x), [])

This term rewrite system is equivalent to $\mathcal{A}_{\mathsf{rev}}$ from above, both extensionally and in terms of the
underlying complexity. However, the FOPs we have considered can indeed conclude that main has
linear complexity, a result that can be easily lifted back to the original program.
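
The linear bound can also be observed empirically. The following is only an illustrative sketch, not HOCA output: it transcribes the six rules of the first-order system into OCaml functions and counts one tick per rule application. The names `cl1_1`, `fix_walk_1`, `steps`, `tick` and `run` are hypothetical, and the catch-all case in `cl1_1` is an assumption covering terms for which no rule applies.

```ocaml
(* Closure representations that survive in the simplified system. *)
type 'a cl =
  | Cl1 of 'a cl * 'a cl
  | Cl2
  | Cl3 of 'a

(* Hypothetical step counter: one tick per rule application. *)
let steps = ref 0
let tick () = incr steps

(* Rules (1) and (2): the uncurried application of a Cl1 closure. *)
let rec cl1_1 f g z =
  tick ();
  match f, g with
  | Cl2, Cl3 x -> x :: z
  | Cl1 (f', g'), Cl3 x -> cl1_1 f' g' (x :: z)
  | _ -> invalid_arg "cl1_1: no rule applies"

(* Rules (3) and (4): structurally recursive construction of the closure. *)
let rec fix_walk_1 xs =
  tick ();
  match xs with
  | [] -> Cl2
  | x :: ys -> Cl1 (fix_walk_1 ys, Cl3 x)

(* Rules (5) and (6). *)
let main l =
  tick ();
  match l with
  | [] -> []
  | x :: ys -> cl1_1 (fix_walk_1 ys) (Cl3 x) []

(* Run main on an input and report the result with the step count. *)
let run l =
  steps := 0;
  let r = main l in
  (r, !steps)
```

Under this reading of the rules, a list of length n ≥ 1 is reversed in 2n + 1 rule applications: one for main, n for fix_walk^1 and n for Cl1^1.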

Sections 4 and 5 are concerned with a precise analysis of the program transformations we
employed when turning $\mathcal{A}_{\mathsf{rev}}$ into $\mathcal{R}_{\mathsf{rev}}$. Before that, we recap central definitions in the next
section.

3 Preliminaries 

The purpose of this section is to give some preliminary notions about the λ-calculus, term rewrite
systems, and translations between them; see [10, 43, 50] for further reading.
To model a reasonably rich but pure and monomorphic functional language, we consider a
typed λ-calculus with constants and fixpoints akin to Plotkin's PCF [44]. To seamlessly express
programs over algebraic datatypes, we allow constructors and pattern matching. To this end,
let $C_1, \dots, C_k$ be finitely many constructors, each equipped with a fixed arity. The syntax of
PCF-programs is given by the following grammar:

\[
\mathsf{Exp} \ni e, f ::= x \mid C_i(\vec{e}) \mid \lambda x.e \mid e\ f \mid \mathsf{fix}(x.e)
  \mid \mathsf{match}\ e\ \{C_1(\vec{x}_1) \mapsto e_1; \cdots; C_k(\vec{x}_k) \mapsto e_k\} ,
\]

where $x$ ranges over variables. Note that the variables $\vec{x}_i$ in a match-expression are considered
bound in $e_i$. A simple type system can be easily defined based on a single ground type, and
on the usual arrow type constructor. We claim that extending the language with products and
coproducts would not be problematic.

We adopt weak call-by-value semantics; the definition is standard, see e.g. [28]. Here weak
means that reduction under any λ-abstraction $\lambda x.e$ and any fixpoint expression $\mathsf{fix}(x.e)$ is
prohibited. Call-by-value means that in a redex $e\ f$, the expression $e$ has to be evaluated to a
value first. A match-expression $\mathsf{match}\ e\ \{C_1(\vec{x}_1) \mapsto e_1; \cdots; C_k(\vec{x}_k) \mapsto e_k\}$ is evaluated by first
evaluating the guard $e$ to a value $C_i(\vec{v})$; reduction then continues with the corresponding
case-expression $e_i$, with the values $\vec{v}$ substituted for the variables $\vec{x}_i$. The one-step weak
call-by-value reduction relation is denoted by $\to_v$. Elements of the term algebra over
constructors $C_1, \dots, C_k$ embedded in our language are collected in Input. A PCF-program with $n$
input arguments is a closed expression $P = \lambda x_1 \cdots \lambda x_n.e$ of first-order type. What this
implicitly means is that we are interested in an analysis of programs with a possibly very
intricate internal higher-order structure, but whose arguments are values of ground type. This is
akin to the setting in [11] and provides an intuitive notion of runtime complexity for
higher-order programs, without having to rely on ad-hoc restrictions on the use of function
abstractions (as e.g. [34]). This way we also ensure that the abstractions reduced in a run of
$P$ are the ones found in $P$, an essential property for performing defunctionalisation. We assume
that variables in $P$ have been renamed apart, and we impose a total order on variables in $P$. The
free variables $\mathsf{FV}(e)$ in the body $e$ of $P$ can this way be defined as an ordered sequence of
variables.

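Example 1. We fix constructors [] and (::) for lists, the latter we write infix. Then the program
computing the reverse of a list, as described in the previous section, can be seen as the PCF
term $P_{\mathsf{rev}} := \lambda l.\ \mathsf{rev}\ l$ where

\[
\begin{aligned}
\mathsf{rev}  &= \lambda l.\ \mathsf{fix}(w.\mathsf{walk})\ l\ [] ; \\
\mathsf{walk} &= \lambda xs.\ \mathsf{match}\ xs\ \{\, [] \mapsto \lambda z.z ;\ x :: ys \mapsto \mathsf{comp}\ (w\ ys)\ (\lambda z.\, x :: z) \,\} ; \\
\mathsf{comp} &= \lambda f.\lambda g.\lambda z.\ f\ (g\ z) .
\end{aligned}
\]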
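
To make the syntax tangible, here is a minimal sketch of how PCF expressions and the term of Example 1 might be represented as an OCaml datatype. The constructor names (`Var`, `Con`, `Lam`, `App`, `Fix`, `Match`) and the function `fv` are assumptions of this illustration; in particular, `fv` orders variables by first occurrence, which is only one possible choice for the total order mentioned above.

```ocaml
type exp =
  | Var of string
  | Con of string * exp list                          (* C_i(e_1, ..., e_k) *)
  | Lam of string * exp                               (* λx.e *)
  | App of exp * exp                                  (* e f *)
  | Fix of string * exp                               (* fix(x.e) *)
  | Match of exp * (string * string list * exp) list  (* match e {cs} *)

(* Free variables, in order of first occurrence and without duplicates. *)
let rec fv e =
  let without xs vs = List.filter (fun v -> not (List.mem v xs)) vs in
  let dedup vs =
    List.fold_left (fun acc v -> if List.mem v acc then acc else acc @ [v]) [] vs in
  match e with
  | Var x -> [x]
  | Con (_, es) -> dedup (List.concat_map fv es)
  | Lam (x, b) | Fix (x, b) -> without [x] (fv b)
  | App (e1, e2) -> dedup (fv e1 @ fv e2)
  | Match (g, cs) ->
    dedup (fv g @ List.concat_map (fun (_, xs, b) -> without xs (fv b)) cs)

(* The term of Example 1; bound variables are renamed apart. *)
let comp =
  Lam ("f", Lam ("g", Lam ("z", App (Var "f", App (Var "g", Var "z")))))

let walk =
  Lam ("xs", Match (Var "xs",
    [ ("[]", [], Lam ("z1", Var "z1"));
      ("::", ["x"; "ys"],
       App (App (comp, App (Var "w", Var "ys")),
            Lam ("z2", Con ("::", [Var "x"; Var "z2"])))) ]))

let rev = Lam ("l", App (App (Fix ("w", walk), Var "l"), Con ("[]", [])))

let p_rev = Lam ("l0", App (rev, Var "l0"))
```

The body `walk` on its own mentions the fixpoint variable `w` freely, whereas the whole program `p_rev` is closed, as required of a PCF-program.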


The second kind of programming formalism we will deal with is the one of term rewrite systems
(TRSs for short). Let $\mathcal{F} = \{f_1, \dots, f_n\}$ be a set of function symbols, each equipped again with an
arity, the signature. We denote by $s, t, \dots$ terms over the signature $\mathcal{F}$, possibly including variables.
A position $p$ in $t$ is a finite sequence of integers, such that the following definition of subterm at
position $p$ is well-defined: $t|_\epsilon = t$ for the empty position $\epsilon$, and $t|_{ip} = t_i|_p$ for $t = f(t_1, \dots, t_k)$. For
a position $p$ in $t$, we denote by $t[s]_p$ the term obtained by replacing the subterm at position $p$ in
$t$ by the term $s$. A context $C$ is a term containing one occurrence of a special symbol $\Box$, the hole.
We define $C[t] := C[t]_p$ for $p$ the position of $\Box$ in $C$, i.e., $C|_p = \Box$.

A substitution is a finite mapping $\sigma$ from variables to terms. By $t\sigma$ we denote the term
obtained by replacing in $t$ all variables $x$ in the domain of $\sigma$ by $\sigma(x)$. A substitution $\sigma$ is at least
as general as a substitution $\tau$ if there exists a substitution $\tau'$ such that $\tau(x) = \sigma(x)\tau'$ for each
variable $x$. A term $t$ is an instance of a term $s$ if there exists a substitution $\sigma$ with $s\sigma = t$; the
terms $t$ and $s$ unify if there exists a substitution $\mu$, the unifier, such that $t\mu = s\mu$. If two terms
are unifiable, then there exists a most general unifier (mgu for short).

A term rewrite system $\mathcal{R}$ is a finite set of rewrite rules, i.e., directed equations $f(l_1, \dots, l_k) \to r$
such that all variables occurring in the right-hand side $r$ occur also in the left-hand side $f(l_1, \dots, l_k)$.
[Figure 2: inference rules not recoverable; surviving side condition: $x\sigma \in \mathcal{T}(\mathcal{C}_{\mathcal{R}})$ for all variables $x$ occurring in $f(l_1, \dots, l_n)$.]
Figure 2: Call-by-value rewrite relation with respect to a TRS $\mathcal{R}$.


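The roots of left-hand sides, the defined symbols of $\mathcal{R}$, are collected in $\mathcal{D}_{\mathcal{R}}$; the remaining symbols
$\mathcal{F} \setminus \mathcal{D}_{\mathcal{R}}$ are the constructors of $\mathcal{R}$ and collected in $\mathcal{C}_{\mathcal{R}}$. Terms over the constructors $\mathcal{C}_{\mathcal{R}}$ are
considered values and collected in $\mathcal{T}(\mathcal{C}_{\mathcal{R}})$. We adopt call-by-value semantics for TRSs, see Figure 2
where the call-by-value rewrite relation $\to_{\mathcal{R}}$ is defined.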

Throughout the following, we consider non-ambiguous rewrite systems, that is, the left-hand
sides are pairwise non-overlapping. Even though $\to_{\mathcal{R}}$ may be non-deterministic, the following
special case of the parallel moves lemma [10] tells us that this form of non-determinism is not
harmful for complexity analysis.

Proposition 1. For a non-ambiguous TRS $\mathcal{R}$, all normalising reductions of $t$ have the same
length, i.e., if $t \to^m_{\mathcal{R}} u_1$ and $t \to^n_{\mathcal{R}} u_2$ for two irreducible terms $u_1$ and $u_2$, then $u_1 = u_2$ and $m = n$.

An applicative term rewrite system (ATRS for short) is usually defined as a TRS over a
signature consisting of a finite set of nullary function symbols and one dedicated binary symbol
(@), the application symbol. We follow the usual convention that (@) associates to the left. Here,
we are more liberal and just assume the presence of (@), and allow function symbols that take
more than one argument. Throughout the following, we are foremost dealing with ATRSs, which
we denote by $\mathcal{A}, \mathcal{B}$ below. We also write (@) infix.

In the following, we show that every PCF-program $P$ can be seen as an applicative term
rewrite system $\mathcal{A}_P$. To this end, we first define an infinite schema $\mathcal{A}_{\mathsf{PCF}}$ of rewrite rules which
allows us to evaluate the whole of PCF. The signature underlying $\mathcal{A}_{\mathsf{PCF}}$ contains, besides the
application symbol (@) and constructors $C_1, \dots, C_k$, the following function symbols, called closure
constructors: (i) for each PCF term $\lambda x.e$ with $n$ free variables an $n$-ary symbol $\mathsf{lam}_{x.e}$; (ii) for each
PCF term $\mathsf{fix}(x.e)$ with $n$ free variables an $n$-ary symbol $\mathsf{fix}_{x.e}$; and (iii) for each match-expression
$\mathsf{match}\ e\ \{cs\}$ with $n$ free variables a symbol $\mathsf{match}_{cs}$ of arity $n + 1$. Furthermore, we define a
mapping $[\cdot]_\Phi$ from PCF terms to $\mathcal{A}_{\mathsf{PCF}}$ terms as follows.

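\[
\begin{aligned}
[x]_\Phi &:= x ; \\
[\lambda x.e]_\Phi &:= \mathsf{lam}_{x.e}(\vec{x}), \text{ where } \vec{x} = \mathsf{FV}(\lambda x.e) ; \\
[C_i(e_1, \dots, e_k)]_\Phi &:= C_i([e_1]_\Phi, \dots, [e_k]_\Phi) ; \\
[f\ e]_\Phi &:= [f]_\Phi \mathbin{@} [e]_\Phi ; \\
[\mathsf{fix}(x.e)]_\Phi &:= \mathsf{fix}_{x.e}(\vec{x}), \text{ where } \vec{x} = \mathsf{FV}(\mathsf{fix}(x.e)) ; \\
[\mathsf{match}\ e\ \{cs\}]_\Phi &:= \mathsf{match}_{cs}([e]_\Phi, \vec{x}), \text{ where } \vec{x} = \mathsf{FV}(\{cs\}) .
\end{aligned}
\]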

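Based on this interpretation, each closure constructor is equipped with one or more of the following
defining rules:

\[
\begin{aligned}
\mathsf{lam}_{x.e}(\vec{x}) \mathbin{@} x &\to [e]_\Phi ; \\
\mathsf{fix}_{x.e}(\vec{x}) \mathbin{@} y &\to [e\{\mathsf{fix}(x.e)/x\}]_\Phi \mathbin{@} y , \text{ where } y \text{ is fresh;} \\
\mathsf{match}_{cs}(C_i(\vec{x}_i), \vec{x}) &\to [e_i]_\Phi , \text{ for } i = 1, \dots, k.
\end{aligned}
\]

Here, we suppose $cs = \{C_1(\vec{x}_1) \mapsto e_1; \cdots; C_k(\vec{x}_k) \mapsto e_k\}$.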

For a program $P = \lambda x_1 \cdots \lambda x_n.e$, the ATRS $\mathcal{A}_P$ consists of (i) a rule $\mathsf{main}(x_1, \dots, x_n) \to [e]_\Phi$,
where main is a dedicated function symbol, together with (ii) the least subset of $\mathcal{A}_{\mathsf{PCF}}$ that defines
all closure constructors occurring in $\mathcal{A}_P$. Crucially, $\mathcal{A}_P$ is always finite; in fact, the size of $\mathcal{A}_P$ is
linearly bounded in the size of $P$, see below.

Remark. This statement becomes trivial if we consider the alternative defining rule

\[
\mathsf{fix}_{x.e}(\vec{x}) \mathbin{@} y \to [e]_\Phi\{x/\mathsf{fix}_{x.e}(\vec{x})\} \mathbin{@} y ,
\]

which would also correctly model the semantics of fixpoints $\mathsf{fix}(x.e)$. Then the closure constructors
occurring in $\mathcal{A}_P$ are all obtained from sub-expressions of $P$. Our choice is motivated by the fact
that closure constructors of fixpoints are propagated to call sites, something that facilitates our
transformation approach to complexity analysis.

Example 2. The expression $P_{\mathsf{rev}}$ from Example 1 gets translated into the ATRS $\mathcal{A}_{P_{\mathsf{rev}}} = \mathcal{A}_{\mathsf{rev}}$ we
introduced in Section 2.

We obtain the following simulation result.

Proposition 2. Every $\to_v$-reduction of an expression $P\ d_1 \cdots d_n$ ($d_j \in \mathsf{Input}$) is simulated
step-wise by a call-by-value $\mathcal{A}_P$-derivation starting from $\mathsf{main}(d_1, \dots, d_n)$.

As the inverse direction of this proposition can also be stated, $\mathcal{A}_P$ can be seen as a sound and
complete, in particular step-preserving, implementation of the PCF-program $P$.

In correspondence to Proposition 2, we define the runtime complexity of an ATRS $\mathcal{A}$ as follows.
As above, only terms $d \in \mathsf{Input}$ built from the constructors $\mathcal{C}$ are considered valid inputs. The
runtime of $\mathcal{A}$ on inputs $d_1, \dots, d_n$ is defined as the length of the longest rewrite sequence starting
from $\mathsf{main}(d_1, \dots, d_n)$. The runtime complexity function is defined as the (partial) function which
maps the natural number $m$ to the maximum runtime of $\mathcal{A}$ on inputs $d_1, \dots, d_n$ with $\sum_{j=1}^{n} |d_j| \leq m$,
where the size $|d|$ is defined as the number of occurrences of constructors in $d$.
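
To see what this definition measures on the running example, the following sketch implements a tiny innermost rewriting engine and counts rule applications for the eleven rules of the applicative system from Section 2. It is an illustration under stated assumptions, not the machinery of any FOP: the names `term`, `matches`, `subst`, `normalise`, `a_rev`, `list_term` and `runtime` are all hypothetical, arguments are reduced to normal form before the root is inspected (which coincides with call-by-value on this example), and a single reduction is measured, which suffices for non-ambiguous systems by Proposition 1.

```ocaml
(* First-order terms: variables and function symbols applied to arguments. *)
type term =
  | V of string
  | F of string * term list

let c name = F (name, [])            (* nullary symbol *)
let app s t = F ("@", [s; t])        (* application symbol *)
let nil = c "[]"
let cons x xs = F ("::", [x; xs])

(* Extend substitution [sub] so that pattern [pat] matches term [t]. *)
let rec matches sub pat t =
  match pat, t with
  | V x, _ ->
    (match List.assoc_opt x sub with
     | None -> Some ((x, t) :: sub)
     | Some t' -> if t = t' then Some sub else None)
  | F (f, ps), F (g, ts) when f = g && List.length ps = List.length ts ->
    List.fold_left2
      (fun acc p u -> match acc with None -> None | Some s -> matches s p u)
      (Some sub) ps ts
  | F _, _ -> None

let rec subst sub = function
  | V x -> (match List.assoc_opt x sub with Some t -> t | None -> V x)
  | F (f, ts) -> F (f, List.map (subst sub) ts)

(* Innermost normalisation: arguments first, then the root.
   Returns the normal form and the number of rule applications. *)
let rec normalise rules t =
  match t with
  | V _ -> (t, 0)
  | F (f, ts) ->
    let step a (acc, n) =
      let a', k = normalise rules a in (a' :: acc, n + k) in
    let ts', n = List.fold_right step ts ([], 0) in
    let t' = F (f, ts') in
    let rec try_rules = function
      | [] -> (t', n)
      | (l, r) :: rest ->
        (match matches [] l t' with
         | Some sub ->
           let u, k = normalise rules (subst sub r) in (u, n + k + 1)
         | None -> try_rules rest)
    in
    try_rules rules

(* The eleven rules of the applicative system, as (lhs, rhs) pairs. *)
let a_rev =
  let f, g, z, x, xs, ys, l =
    V "f", V "g", V "z", V "x", V "xs", V "ys", V "l" in
  let cl1 s t = F ("Cl1", [s; t]) and cl3 s = F ("Cl3", [s]) in
  let mw t = F ("match_walk", [t]) in
  [ app (cl1 f g) z, app f (app g z);
    app (c "Cl2") z, z;
    app (cl3 x) z, cons x z;
    app (F ("comp1", [f])) g, cl1 f g;
    app (c "comp") f, F ("comp1", [f]);
    mw nil, c "Cl2";
    mw (cons x ys), app (app (c "comp") (app (c "fix_walk") ys)) (cl3 x);
    app (c "walk") xs, mw xs;
    app (c "fix_walk") xs, app (c "walk") xs;
    app (c "rev") l, app (app (c "fix_walk") l) nil;
    F ("main", [l]), app (c "rev") l ]

(* Build a list term from element names and measure main on it. *)
let rec list_term = function
  | [] -> nil
  | s :: rest -> cons (c s) (list_term rest)

let runtime input = snd (normalise a_rev (F ("main", [list_term input])))
```

With this reading, `runtime` grows as 7n + 6 in the length n of the input list, so the applicative system is linear as well; the difficulty discussed in Section 2 lies in establishing this automatically, not in the bound itself.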

Crucially, our notion of runtime complexity corresponds to the notion employed in first-order
rewriting and in particular in FOPs. Our simple form of defunctionalisation thus paves the way
to our primary goal: a successful complexity analysis of $\mathcal{A}_P$ with rewriting-based tools can be
relayed back to the PCF-program $P$.

4 Complexity Reflecting Transformations 

The result offered by Proposition 2 is remarkable, but is a Pyrrhic victory towards our final goal: 
as discussed in Section 2, the complexity of defunctionalised programs is hard to analyse, at least 
if one wants to go via FOPs. It is then time to introduce the four program transformations that 
form our toolbox, and that will allow us to turn defunctionalised programs into ATRSs which are 
easier to analyse. 

In this section, we describe the four transformations abstractly, without caring too much about 
how one could implement them. Rather, we focus on their correctness and, even more importantly 
for us, we verify that the complexity of the transformed program is not too small compared to the 
complexity of the original one. We will also show, through examples, how all this can indeed be 
seen as a way to simplify the recursive structure of the programs at hand. 

A transformation is a partial function $f$ from ATRSs to ATRSs. In the case that $f(\mathcal{A})$ is
undefined, the transformation is called inapplicable to $\mathcal{A}$. We call the transformation $f$
(asymptotically) complexity reflecting if for every ATRS $\mathcal{A}$, the runtime complexity of $\mathcal{A}$ is bounded
(asymptotically) by the runtime complexity of $f(\mathcal{A})$, whenever $f$ is applicable on $\mathcal{A}$. Conversely,
we call $f$ (asymptotically) complexity preserving if the runtime complexity of $f(\mathcal{A})$ is bounded
(asymptotically) by the complexity of $\mathcal{A}$, whenever $f$ is applicable on $\mathcal{A}$. The former condition
states a form of soundness: if $f$ is complexity reflecting, then a bound on the runtime complexity
of $f(\mathcal{A})$ can be relayed back to $\mathcal{A}$. The latter condition states a form of completeness:
application of a complexity preserving transformation $f$ will not render our analysis ineffective
simply because $f$ translates $\mathcal{A}$ to an inefficient version. We remark that the set of complexity
preserving (complexity reflecting) transformations is closed under composition.
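
The two notions and the closure remark can be restated compactly. In the sketch below, $\mathrm{rc}_{\mathcal{A}}$ is a name introduced here for the runtime complexity function of $\mathcal{A}$ (it is not notation from the text above), and reading "bounded asymptotically" in the usual big-O sense is an assumption about the intended meaning.

```latex
\begin{align*}
f \text{ complexity reflecting}
  &\iff \mathrm{rc}_{\mathcal{A}}(m) \le \mathrm{rc}_{f(\mathcal{A})}(m)
        \text{ for all } m, \text{ whenever } f(\mathcal{A}) \text{ is defined;} \\
f \text{ complexity preserving}
  &\iff \mathrm{rc}_{f(\mathcal{A})}(m) \le \mathrm{rc}_{\mathcal{A}}(m)
        \text{ for all } m, \text{ whenever } f(\mathcal{A}) \text{ is defined;} \\
f \text{ asymptotically complexity reflecting}
  &\iff \mathrm{rc}_{\mathcal{A}} \in O(\mathrm{rc}_{f(\mathcal{A})}),
        \text{ whenever } f(\mathcal{A}) \text{ is defined.}
\end{align*}
% Closure under composition, for complexity reflecting f and g
% with g applicable to f(A):
\[
\mathrm{rc}_{\mathcal{A}}(m)
  \;\le\; \mathrm{rc}_{f(\mathcal{A})}(m)
  \;\le\; \mathrm{rc}_{g(f(\mathcal{A}))}(m) .
\]
```

The chain of inequalities is what allows a bound found for the end of a transformation pipeline to be relayed back, step by step, to the system the pipeline started from.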
4.1 Inlining

Our first transformation constitutes a form of inlining. This allows for the elimination of auxiliary 
functions, this way making the recursive structure of the considered program apparent. 

Consider the ATRS $\mathcal{A}_{\mathsf{rev}}$ from Section 2. There, for instance, the call to walk in the definition
of fix_walk could be inlined, thus resulting in a new definition:

    fix_walk @ xs  →  match_walk(xs)

Informally, thus, inlining consists in modifying the right-hand sides of ATRS rules by rewriting
subterms, according to the ATRS itself. We will also go beyond rewriting, by first specialising
arguments sufficiently so that a rewrite triggers. In the above rule for instance, match_walk cannot
be inlined immediately, simply because match_walk is itself defined by case analysis on xs. To allow
inlining of this function nevertheless, we specialise xs to the patterns [] and x :: ys, the patterns
underlying the case analysis of match_walk. This results in two alternative rules for fix_walk, namely

    fix_walk @ []         →  match_walk([])
    fix_walk @ (x :: ys)  →  match_walk(x :: ys) .

Now we can inline match_walk, and as a consequence the rules defining fix_walk are easily seen to be
structurally recursive, a fact that FOPs can recognise and exploit.

A convenient way to formalise inlining is by way of narrowing [10]. We say that a term $s$
narrows to a term $t$ at a non-variable position $p$ in $s$, in notation $s \rightsquigarrow^{\mu}_{\mathcal{A},p} t$, if there exists a rule
$l \to r \in \mathcal{A}$ such that $\mu$ is a unifier of the left-hand side $l$ and the subterm $s|_p$ (after renaming apart
variables in $l \to r$ and $s$) and $t = s\mu[r\mu]_p$. In other words, the instance $s\mu$ of $s$ rewrites to $t$ at
position $p$ with rule $l \to r \in \mathcal{A}$. The substitution $\mu$ is just enough to uncover the corresponding
redex in $s$. Note however that the performed rewrite step is not necessarily call-by-value; the mgu
$\mu$ could indeed contain function calls. We define the set of all inlinings of a rule $l \to r$ at a position
$p$ which is labeled by a defined symbol by

\[
\mathsf{inline}_{\mathcal{A},p}(l \to r) := \{\, l\mu \to r' \mid r \rightsquigarrow^{\mu}_{\mathcal{A},p} r' \,\} .
\]

The following example demonstrates inlining through narrowing. 

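Example 3. Consider the substitutions $\mu_1 = \{xs \mapsto []\}$ and $\mu_2 = \{xs \mapsto x :: ys\}$. Then we have

\[
\begin{aligned}
\mathtt{match\_walk}(xs) &\rightsquigarrow^{\mu_1}_{\mathcal{A}_{\mathsf{rev}},\epsilon} \mathtt{Cl2} \\
\mathtt{match\_walk}(xs) &\rightsquigarrow^{\mu_2}_{\mathcal{A}_{\mathsf{rev}},\epsilon} \mathtt{comp} \mathbin{@} (\mathtt{fix\_walk} \mathbin{@} ys) \mathbin{@} \mathtt{Cl3}(x) .
\end{aligned}
\]

Since no other rule of $\mathcal{A}_{\mathsf{rev}}$ unifies with the right-hand side match_walk(xs), the set

\[
\mathsf{inline}_{\mathcal{A}_{\mathsf{rev}},\epsilon}(\mathtt{fix\_walk} \mathbin{@} xs \to \mathtt{match\_walk}(xs))
\]

consists of the two rules

    fix_walk @ []         →  Cl2
    fix_walk @ (x :: ys)  →  comp @ (fix_walk @ ys) @ Cl3(x) .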

Inlining is in general not complexity reflecting. Indeed, inlining is employed by many
compilers as a program optimisation technique. The following examples highlight two issues we
have to address. The first example indicates the obvious: in a call-by-value setting, inlining is
not asymptotically complexity reflecting if potentially expensive function calls in arguments are
deleted.

Example 4. Consider the following inefficient system:

1  $\mathsf{k}(x, y) \to x$

2  $\mathsf{main}(0) \to 0$

3  $\mathsf{main}(\mathsf{S}(n)) \to \mathsf{k}(\mathsf{main}(n), \mathsf{main}(n))$

Inlining $\mathsf{k}$ in the definition of $\mathsf{main}$ results in an alternative definition $\mathsf{main}(\mathsf{S}(n)) \to \mathsf{main}(n)$
of rule (3), eliminating one of the two recursive calls and thereby reducing the complexity from
exponential to linear.
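
The gap between the two systems can be observed by counting call-by-value rewrite steps. The following Python sketch is an illustration added here under assumptions of our own: terms are nested tuples, variables are strings, and `normalise` is a hypothetical, deliberately naive innermost evaluator that counts rule applications. It is not the analysis performed by a FOP.

```python
def match(pat, t, s):
    """Extend substitution s so that pat instantiated by s equals t, or return None."""
    if isinstance(pat, str):
        if pat in s:
            return s if s[pat] == t else None
        return {**s, pat: t}
    if isinstance(t, str) or pat[0] != t[0] or len(pat) != len(t):
        return None
    for p, a in zip(pat[1:], t[1:]):
        s = match(p, a, s)
        if s is None:
            return None
    return s

def inst(t, s):
    return s[t] if isinstance(t, str) else (t[0],) + tuple(inst(a, s) for a in t[1:])

def normalise(t, rules):
    """Innermost normalisation of a ground term; returns (normal form, number of steps)."""
    steps, args = 0, []
    for a in t[1:]:
        nf, k = normalise(a, rules)
        args.append(nf)
        steps += k
    t = (t[0],) + tuple(args)
    for l, r in rules:
        s = match(l, t, {})
        if s is not None:
            nf, k = normalise(inst(r, s), rules)
            return nf, steps + 1 + k
    return t, steps

def num(n):
    """The numeral S(...S(0)...) with n occurrences of S."""
    t = ("0",)
    for _ in range(n):
        t = ("S", t)
    return t

original = [
    (("k", "x", "y"), "x"),
    (("main", ("0",)), ("0",)),
    (("main", ("S", "n")), ("k", ("main", "n"), ("main", "n"))),
]
# Rule (3) replaced by the result of inlining k.
inlined = original[:2] + [(("main", ("S", "n")), ("main", "n"))]

before = [normalise(("main", num(n)), original)[1] for n in range(4)]
after = [normalise(("main", num(n)), inlined)[1] for n in range(4)]
print(before, after)
```

For inputs $n = 0, \dots, 3$ the original system needs 1, 4, 10 and 22 steps, whereas the inlined system needs 1, 2, 3 and 4 steps, so a bound inferred for the inlined system says nothing about the original one.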

The example motivates the following, easily decidable, condition. Let $l \to r$ denote a rule
whose right-hand side is subject to inlining at position $p$. Suppose the rule $u \to v \in \mathcal{A}$ is unifiable
with the subterm $r|_p$ of the right-hand side $r$, and let $\mu$ denote the most general unifier. Then
we say that inlining $r|_p$ with $u \to v$ is redex preserving if whenever $x\mu$ contains a defined symbol
of $\mathcal{A}$, then the variable $x$ occurs also in the right-hand side $v$. The inlining of $l \to r$ at position $p$
is called redex preserving if inlining $r|_p$ is redex preserving with all rules $u \to v$ that unify with
$r|_p$. Redex preservation thus ensures that inlining does not delete potential function calls, apart
from the inlined one. In the example above, inlining $\mathsf{k}(\mathsf{main}(n), \mathsf{main}(n))$ is not redex preserving
because the variable $y$ is mapped to $\mathsf{main}(n)$ by the underlying unifier, but $y$ is deleted in the
inlining rule $\mathsf{k}(x, y) \to x$.
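
The redex-preservation condition is a purely syntactic check on the unifier and the inlined rule. A minimal Python sketch follows; it is our own illustration, with terms encoded as nested tuples, variables as strings, the unifier given as a dictionary, and the function name `redex_preserving` chosen for this example only.

```python
def symbols(t):
    """Function symbols occurring in a term."""
    if isinstance(t, str):
        return set()
    return {t[0]}.union(*(symbols(a) for a in t[1:]))

def variables(t):
    """Variables occurring in a term."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def redex_preserving(mu, rule, defined):
    """Check: whenever x.mu contains a defined symbol, x occurs in the right-hand side v."""
    u, v = rule
    return all(x in variables(v)
               for x in variables(u)
               if symbols(mu.get(x, x)) & defined)

# Example 4: inlining k(main(n), main(n)) with k(x, y) -> x.
defined = {"k", "main"}
k_rule = (("k", "x", "y"), "x")
mu = {"x": ("main", "n"), "y": ("main", "n")}
print(redex_preserving(mu, k_rule, defined))  # y carries a call but is erased
```

The check fails for Example 4 because `y` is bound to a term containing the defined symbol `main` and does not occur in the right-hand side `x`.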

Our second example is more subtle and arises when the studied rewrite system is under-specified:

Example 5. Consider the system consisting of the following rules.

1  $\mathsf{h}(x, 0) \to x$

2  $\mathsf{main}(0) \to 0$

3  $\mathsf{main}(\mathsf{S}(n)) \to \mathsf{h}(\mathsf{main}(n), n)$

Inlining $\mathsf{h}$ in the definition of $\mathsf{main}$ will specialise the variable $n$ to $0$ and thus replaces rule (3)
by $\mathsf{main}(\mathsf{S}(0)) \to \mathsf{main}(0)$. Note that the runtime complexity of the former system is linear,
whereas its runtime complexity is constant after transformation.
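
To see where the linear behaviour of the original system comes from, the following worked derivation (added here for illustration, using only the three rules of Example 5) evaluates $\mathsf{main}(\mathsf{S}(\mathsf{S}(0)))$ under call-by-value:

```latex
\begin{align*}
\mathsf{main}(\mathsf{S}(\mathsf{S}(0)))
  &\to \mathsf{h}(\mathsf{main}(\mathsf{S}(0)), \mathsf{S}(0)) \\
  &\to \mathsf{h}(\mathsf{h}(\mathsf{main}(0), 0), \mathsf{S}(0)) \\
  &\to \mathsf{h}(\mathsf{h}(0, 0), \mathsf{S}(0)) \\
  &\to \mathsf{h}(0, \mathsf{S}(0))
\end{align*}
```

The last term is a normal form, since the only rule for $\mathsf{h}$ requires the second argument to be $0$. Each additional $\mathsf{S}$ in the input contributes one further unfolding of $\mathsf{main}$. In the transformed system, by contrast, $\mathsf{main}(\mathsf{S}(\mathsf{S}(0)))$ is itself a normal form.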

Crucial for the example, the symbol $\mathsf{h}$ is not sufficiently defined, i.e., the computation gets
stuck after completely unfolding $\mathsf{main}$. To overcome this issue, we require that inlined functions
are sufficiently defined. Here a defined function symbol $f$ is called sufficiently defined, with respect
to an ATRS $\mathcal{A}$, if all subterms $f(\vec{t})$ occurring in a reduction of $\mathsf{main}(d_1, \dots, d_n)$ ($d_j \in \mathsf{Input}$) are
reducible. This property is not decidable in general. Still, the ATRSs obtained from the translation
in Section 3 satisfy this condition for all defined symbols: by construction, reductions do not get
stuck. Inlining, and the transformations discussed below, preserve this property.

We will now show that under the above outlined conditions, inlining is indeed complexity 
reflecting. Fix an ATRS A. The following auxiliary lemma follows by a standard induction on the 
length of derivations, see e.g. [31]. As a consequence, we can assume that reductions have a very 
specific form. 

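Lemma 1.

1. If $C[t] \to^{m}_{\mathcal{A}} u$ is a normalizing derivation, then $C[t] \to^{m_1}_{\mathcal{A}} C[s] \to^{m_2}_{\mathcal{A}} u$ for some normal form
$s$ of $t$ and $m_1, m_2 \in \mathbb{N}$ with $m_1 + m_2 = m$.

2. If $t\sigma \to^{m}_{\mathcal{A}} u$ is a normalizing derivation, then $t\sigma \to^{m_1}_{\mathcal{A}} t\tau \to^{m_2}_{\mathcal{A}} u$ for some normalized
substitution $\tau$ and $m_1, m_2 \in \mathbb{N}$ with $m_1 + m_2 = m$.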

In proofs below, we denote by $\rightharpoonup_{\mathcal{A}}$ an extension of $\to_{\mathcal{A}}$ where not all arguments are necessarily
reduced, but where still a step cannot delete redexes: $s \rightharpoonup_{\mathcal{A}} t$ if $s = C[l\sigma]$ and $t = C[r\sigma]$ for a
context $C$, rule $l \to r \in \mathcal{A}$ and a substitution $\sigma$ which satisfies $\sigma(x) \in \mathcal{T}(\mathcal{C}, \mathcal{V})$, i.e., $\sigma(x)$ contains
no defined symbol, for all variables $x$ which occur in $l$ but not in $r$. By definition,
${\to_{\mathcal{A}}} \subseteq {\rightharpoonup_{\mathcal{A}}}$. The relation $\rightharpoonup_{\mathcal{A}}$ is just enough to
capture rewrites performed on right-hand sides in a complexity reflecting inlining.

The next lemma collects the central points of our correctness proof. Here, we first consider
the effect of replacing a single application of a rule $l \to r$ with an application of a corresponding
rule in $\mathsf{inline}_{\mathcal{A},p}(l \to r)$. As the lemma shows, this is indeed always possible, provided the inlined
function is sufficiently defined. Crucially, inlining preserves not only semantics, but complexity
reflecting inlining does not optimize the ATRS under consideration too much, if at all.

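Lemma 2. Let $l \to r$ be a rewrite rule subject to a redex preserving inlining of function $f$ at
position $p$ in $r$. Suppose that the symbol $f$ is sufficiently defined by $\mathcal{A}$. Consider a normalising
reduction

$$\mathsf{main}(d_1, \dots, d_n) \to^{*}_{\mathcal{A}} C[l\sigma] \to_{\mathcal{A}} C[r\sigma] \to^{\ell}_{\mathcal{A}} u ,$$

for $d_i \in \mathsf{Input}$ ($i = 1, \dots, n$) and some $\ell \in \mathbb{N}$. Then there exists a term $t$ such that the following
properties hold:

1. $l\sigma \to_{\mathsf{inline}_{\mathcal{A},p}(l \to r)} t$; and

2. $r\sigma \rightharpoonup_{\mathcal{I}} t$, where $\mathcal{I}$ collects all rules that are unifiable with the subterm of the right-hand side $r$ at position $p$;
and

3. $C[t] \to^{\geq \ell - 1}_{\mathcal{A}} u$.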

Proof. Consider the first property, under the assumptions of the lemma. Since $f$ is sufficiently
defined, the subterm $r|_p\sigma$ of $r\sigma$ rooted in $f$ is a redex, in particular, $r|_p\sigma$ matches the left-hand
side $u$ of a rule $u \to v \in \mathcal{I}$, say $r|_p\sigma = u\tau$ for some substitution $\tau$. Wlog. we suppose that the
rules in $\mathcal{A}$ are variable disjoint with $r$. Hence $\sigma \uplus \tau$ is a well-defined unifier of $r|_p$ and $u$. Let $\mu$ be
a most general unifier of $r|_p$ and $u$. We thus have a substitution $\sigma'_\mu$ such that for all variables $x$ in
$r|_p$, $\sigma(x) = \mu(x)\sigma'_\mu$ holds. Let $\sigma_\mu$ be the least extension of $\sigma'_\mu$ such that $\sigma_\mu(x) = \sigma(x)$ for variables
in $l$ which do not occur in $r|_p$. We conclude $l\sigma = (l\mu)\sigma_\mu \to_{\mathsf{inline}_{\mathcal{A},p}(l \to r)} (r\mu[v\mu]_p)\sigma_\mu$, where the
equality follows by definition of $\sigma_\mu$, and the step by definition of $\mathsf{inline}_{\mathcal{A},p}(l \to r)$. The property
follows by taking $t = (r\mu[v\mu]_p)\sigma_\mu$.

Now for the second property, recall $l\sigma = (l\mu)\sigma_\mu \to_{\mathcal{A}} (r\mu)\sigma_\mu = r\sigma$. Let $D$ denote the context
obtained by replacing the subterm at position $p$ in $r\sigma$ by the hole $\Box$, hence

$$D = r\sigma[\Box]_p = (r\mu)\sigma_\mu[\Box]_p = (r\mu[\Box]_p)\sigma_\mu .$$

Since $\mu$ is an mgu of $r|_p$ and $u$, we thus have

$$r\sigma = D[(r|_p\mu)\sigma_\mu] = D[(u\mu)\sigma_\mu] .$$

Then it is not difficult to conclude that $r\sigma \rightharpoonup_{\mathcal{I}} D[(v\mu)\sigma_\mu] = t$, using that $u \to v \in \mathcal{I}$ is redex
preserving wrt. the considered inlining and that $\sigma_\mu$ contains no defined symbols.

For the final property, consider the sequence $C[r\sigma] \to^{\ell}_{\mathcal{A}} u$, for $u$ in normal form. As we observed
before, $r\sigma = D[u\tau]$ for the context $D$ defined above, $u \to v \in \mathcal{I}$ and $\tau$ a substitution. Using
Lemma 1, and employing that redexes are non-overlapping by assumption on $\mathcal{A}$, we can thus
obtain an alternate derivation of equal length, where we first completely reduce $u\tau$:

$$C[r\sigma] = C[D[u\tau]] \to^{\ell_1}_{\mathcal{A}} C[D[u\tau_n]] \to_{\mathcal{A}} C[D[v\tau_n]] \to^{\ell_2}_{\mathcal{A}} u .$$

Here, $\tau_n$ is the normalised substitution obtained by normalising $\tau$, and $\ell = \ell_1 + \ell_2 + 1$. Note that
by construction, we have $t = D[v\tau]$. Guided by the above derivation we see

$$C[t] = C[D[v\tau]] \to^{k_1}_{\mathcal{A}} C[D[v\tau_n]] \to^{\ell_2}_{\mathcal{A}} u .$$

Using that the step $D[u\tau] \rightharpoonup_{\mathcal{I}} D[v\tau]$ is not deleting redexes occurring in the substitution $\tau$ by
definition, we have $k_1 \geq \ell_1$. In total, the last sequence is thus of length $k_1 + \ell_2 \geq \ell_1 + \ell_2$. From
the definition of $\ell$, the last property follows. □
In consequence, we thus obtain a term $t$ such that

$$\mathsf{main}(d_1, \dots, d_n) \to^{*}_{\mathcal{A}} C[l\sigma] \to_{\mathsf{inline}_{\mathcal{A},p}(l \to r)} C[t] \to^{\geq \ell - 1}_{\mathcal{A}} u$$

holds under the assumptions of the lemma. Complexity preservation of inlining, modulo a constant
factor under the outlined assumption, now follows essentially by induction on the maximal length
of reductions. As a minor technical complication, we have to consider the broader reduction
relation $\rightharpoonup_{\mathcal{A}}$ instead of $\to_{\mathcal{A}}$. To ensure that the induction is well-defined, we use the following
specialization of [27, Theorem 3.13].

Proposition 3. If a term $t$ has a normal form wrt. $\rightharpoonup_{\mathcal{A}}$, then all $\rightharpoonup_{\mathcal{A}}$ reductions of $t$ are finite.

Theorem 1. Let $l \to r$ be a rewrite rule subject to a redex preserving inlining of function $f$ at
position $p$ in $r$. Suppose that the symbol $f$ is sufficiently defined by $\mathcal{A}$. Let $\mathcal{B}$ be obtained by
replacing rule $l \to r$ by the rules $\mathsf{inline}_{\mathcal{A},p}(l \to r)$. Then every normalizing derivation with respect
to $\mathcal{A}$ starting from $\mathsf{main}(d_1, \dots, d_n)$ ($d_j \in \mathsf{Input}$) of length $\ell$ is simulated by a derivation with
respect to $\mathcal{B}$ from $\mathsf{main}(d_1, \dots, d_n)$ of length at least $\lfloor \ell/2 \rfloor$.

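Proof. Suppose $t$ is a reduct of $\mathsf{main}(d_1, \dots, d_n)$ occurring in a normalising reduction, i.e.,
$\mathsf{main}(d_1, \dots, d_n) \to^{*}_{\mathcal{A}} t \to^{*}_{\mathcal{A}} u$, for $u$ a normal form of $\mathcal{A}$. In the proof, we show that if $t \to^{*}_{\mathcal{A}} u$ is a derivation of length $\ell$, then
there exists a normalising derivation with respect to $\mathcal{B}$ whose length is at least $\lfloor \ell/2 \rfloor$. The theorem
then follows by taking $t = \mathsf{main}(d_1, \dots, d_n)$.

We define the derivation height $\mathsf{dh}(s)$ of a term $s$ wrt. the relation $\rightharpoonup_{\mathcal{A}}$ as the maximal $m$ such
that $s \rightharpoonup^{m}_{\mathcal{A}} u$ holds for some term $u$. The proof is by induction on $\mathsf{dh}(t)$, which is well-defined by assumption and
Proposition 3. It suffices to consider the induction step. Suppose $t \to_{\mathcal{A}} s \to^{\ell}_{\mathcal{A}} u$. We consider
the case where the step $t \to_{\mathcal{A}} s$ is obtained by applying the rule $l \to r \in \mathcal{A}$; otherwise, the claim
follows directly from the induction hypothesis. Then as a result of Lemma 2(1) and 2(3) we obtain
an alternative derivation

$$t \to_{\mathcal{B}} s' \to^{\ell'}_{\mathcal{A}} u ,$$

for some term $s'$ and $\ell'$ satisfying $\ell' \geq \ell - 1$. Note that $s \rightharpoonup_{\mathcal{A}} s'$ as a consequence of Lemma 2(2),
and thus $\mathsf{dh}(s) > \mathsf{dh}(s')$ by definition of derivation height. The induction hypothesis on $s'$ thus yields
a derivation $t \to_{\mathcal{B}} s' \to^{*}_{\mathcal{B}} u$ of length at least $\lfloor \ell'/2 \rfloor + 1 = \lfloor (\ell' + 2)/2 \rfloor \geq \lfloor (\ell + 1)/2 \rfloor$. □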

We can then obtain that inlining has the key property we require on transformations. 

Corollary 1 (Inlining Transformation). The inlining transformation, which replaces a rule
$l \to r \in \mathcal{A}$ by $\mathsf{inline}_{\mathcal{A},p}(l \to r)$, is asymptotically complexity reflecting whenever the function considered
for inlining is sufficiently defined and the inlining itself is redex preserving.

Example 6. Consider the ATRS $\mathcal{A}_{\mathsf{rev}}$ from Section 2. Three applications of inlining result in the
following ATRS:

1  $\mathsf{Cl}_1(f, g) @ z \to f @ (g @ z)$

2  $\mathsf{Cl}_2 @ z \to z$

3  $\mathsf{Cl}_3(x) @ z \to x :: z$

4  $\mathsf{comp}_1(f) @ g \to \mathsf{Cl}_1(f, g)$

5  $\mathsf{comp} @ f \to \mathsf{comp}_1(f)$

6  $\mathsf{match}_{\mathsf{walk}}([]) \to \mathsf{Cl}_2$

7  $\mathsf{match}_{\mathsf{walk}}(x :: ys) \to \mathsf{comp} @ (\mathsf{fix}_{\mathsf{walk}} @ ys) @ \mathsf{Cl}_3(x)$

8  $\mathsf{walk} @ xs \to \mathsf{match}_{\mathsf{walk}}(xs)$

9  $\mathsf{fix}_{\mathsf{walk}} @ [] \to \mathsf{Cl}_2$

10  $\mathsf{fix}_{\mathsf{walk}} @ (x :: ys) \to \mathsf{Cl}_1(\mathsf{fix}_{\mathsf{walk}} @ ys, \mathsf{Cl}_3(x))$

11  $\mathsf{rev} @ l \to \mathsf{fix}_{\mathsf{walk}} @ l @ []$

12  $\mathsf{main}(l) \to \mathsf{fix}_{\mathsf{walk}} @ l @ []$

The involved inlining rules are all non-erasing, i.e., all inlinings are redex preserving. As a
consequence of Corollary 1, a bound on the runtime complexity of the above system can be relayed,
within a constant multiplier, back to the ATRS $\mathcal{A}_{\mathsf{rev}}$.

Note that the modified system from Example 6 gives further possibilities for inlining. For
instance, we could narrow further down the call to $\mathsf{fix}_{\mathsf{walk}}$ in rules (10), (11) and (12), performing
case analysis on the variables $ys$ and $l$, respectively. Proceeding this way would step-by-step unfold
the definition of $\mathsf{fix}_{\mathsf{walk}}$, ad infinitum. We could have also further reduced the rules defining
$\mathsf{match}_{\mathsf{walk}}$ and $\mathsf{walk}$. However, it is not difficult to see that these rules will never be unfolded in a
call to $\mathsf{main}$: they have been sufficiently inlined and can be removed. Elimination of such unused
rules will be discussed next.

4.2 Elimination of Dead Code 

The notion of usable rules is well-established in the rewriting community. Although its precise 
definition depends on the context used (e.g. termination [4] and complexity analysis [29]), the 
notion commonly refers to a syntactic method for detecting that certain rules can never be applied 
in derivations starting from a given set of terms. From a programming point of view, such rules 
correspond to dead code, which can be safely eliminated. 

Dead code arises frequently in automatic program transformations, and its elimination turns
out to facilitate our transformation-based approach to complexity analysis. The following
proposition formalises dead code elimination abstractly, for now. Call a rule $l \to r \in \mathcal{A}$ usable if it can
be applied in a derivation

$$\mathsf{main}(d_1, \dots, d_k) \to_{\mathcal{A}} \cdots \to_{\mathcal{A}} t_1 \to_{\{l \to r\}} t_2 ,$$

where $d_i \in \mathsf{Input}$. The rule $l \to r$ is dead code if it is not usable. The following proposition follows
by definition.

Proposition 4 (Dead Code Elimination). Dead code elimination, which maps an ATRS A to a 
subset of A by removing dead code only, is complexity reflecting and preserving. 

It is not computable in general which rules are dead code. One simple way to eliminate dead 
code is to collect all the function symbols underlying the definition of main, and remove the defining 
rules of symbols not in this collection, compare e.g. [29]. This approach works well for standard 
TRSs, but is usually inappropriate for ATRSs where most rules define a single function symbol, the 
application symbol. A conceptually similar, but unification based, approach that works reasonably 
well for ATRSs is given in [23]. However, the accurate identification of dead code, in particular 
in the presence of higher-order functions, requires more than just a simple syntactic analysis. We 
show in Section 5.2 a particular form of control flow analysis which leverages dead code elimination. 
The following example indicates that such an analysis is needed. 

Example 7. We revisit the simplified ATRS from Example 6. The presence of the composition
rule (1), itself a usable rule, makes it harder to infer which of the application rules are dead code.
Indeed, the unification-based method found in [23] classifies all rules as usable. As we hinted in
Section 2, the variables $f$ and $g$ are instantiated only by a very limited number of closures in a call
of $\mathsf{main}(l)$. In particular, none of the symbols $\mathsf{rev}$, $\mathsf{walk}$, $\mathsf{comp}$ and $\mathsf{comp}_1$ are passed to $\mathsf{Cl}_1$. With
this knowledge, it is not difficult to see that their defining rules, together with the rules defining
$\mathsf{match}_{\mathsf{walk}}$, can be eliminated by Proposition 4. Overall, the complexity of the ATRS depicted in
Example 6 is thus reflected by the ATRS consisting of the following six rules.

1  $\mathsf{Cl}_1(f, g) @ z \to f @ (g @ z)$

2  $\mathsf{Cl}_2 @ z \to z$

3  $\mathsf{Cl}_3(x) @ z \to x :: z$

4  $\mathsf{fix}_{\mathsf{walk}} @ [] \to \mathsf{Cl}_2$

5  $\mathsf{fix}_{\mathsf{walk}} @ (x :: ys) \to \mathsf{Cl}_1(\mathsf{fix}_{\mathsf{walk}} @ ys, \mathsf{Cl}_3(x))$

6  $\mathsf{main}(l) \to \mathsf{fix}_{\mathsf{walk}} @ l @ []$
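
The difficulty with purely symbol-based dead code detection on ATRSs can be reproduced with a few lines of Python. The sketch below is our own simplification in the spirit of the symbol-collection approach mentioned before Example 7; it is not the precise notion of usable rules from [29], nor the unification-based method of [23]. Terms are nested tuples, variables are strings, and `usable_by_symbols` is a hypothetical name.

```python
def symbols(t):
    """Function symbols of a term (variables are plain strings)."""
    if isinstance(t, str):
        return set()
    return {t[0]}.union(*(symbols(a) for a in t[1:]))

def usable_by_symbols(rules, start="main"):
    """Keep the rules whose root symbol is reachable from `start` via right-hand sides."""
    reach, changed = {start}, True
    while changed:
        changed = False
        for l, r in rules:
            if l[0] in reach and not symbols(r) <= reach:
                reach |= symbols(r)
                changed = True
    return [(l, r) for l, r in rules if l[0] in reach]

def app(f, a):
    return ("@", f, a)

NIL, CL2, FIX = ("[]",), ("Cl2",), ("fix_walk",)
arev = [  # the twelve rules of Example 6
    (app(("Cl1", "f", "g"), "z"), app("f", app("g", "z"))),
    (app(CL2, "z"), "z"),
    (app(("Cl3", "x"), "z"), ("::", "x", "z")),
    (app(("comp1", "f"), "g"), ("Cl1", "f", "g")),
    (app(("comp",), "f"), ("comp1", "f")),
    (("match_walk", NIL), CL2),
    (("match_walk", ("::", "x", "ys")),
     app(app(("comp",), app(FIX, "ys")), ("Cl3", "x"))),
    (app(("walk",), "xs"), ("match_walk", "xs")),
    (app(FIX, NIL), CL2),
    (app(FIX, ("::", "x", "ys")), ("Cl1", app(FIX, "ys"), ("Cl3", "x"))),
    (app(("rev",), "l"), app(app(FIX, "l"), NIL)),
    (("main", "l"), app(app(FIX, "l"), NIL)),
]
# A small first-order system where the same method does remove a rule (for g).
first_order = [
    (("main", "x"), ("f", "x")),
    (("f", "x"), "x"),
    (("g", "x"), ("g", "x")),
]
print(len(usable_by_symbols(arev)), len(usable_by_symbols(first_order)))
```

Since almost every rule of the ATRS is rooted in the application symbol, all twelve rules are kept, whereas the six-rule system of Example 7 shows that half of them are in fact dead code.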


4.3 Instantiation 

Inlining and dead code elimination can indeed help in simplifying defunctionalised programs. 
There is however a feature of ATRSs they cannot eliminate in general, namely rules whose
right-hand sides have head variables, i.e., variables that occur to the left of an application symbol and
thus denote a function. The presence of such rules prevents FOPs from succeeding in all but trivial
cases. The ATRS from Example 7, for instance, still contains one such rule, namely rule (1), with
head variables $f$ and $g$. The main reason FOPs perform poorly on ATRSs containing such rules is
that they lack any form of control flow analysis, and they are thus unable to realise that function
symbols simulating higher-order combinators are passed arguments of a very specific shape, and
are thus often harmless. This is the case, as an example, for the function symbol $\mathsf{Cl}_1$.

The way out consists in specialising the ATRS rules. This has the potential of highlighting 
the absence of certain dangerous patterns, but of course must be done with great care, without 
hiding complexity under the carpet of non-exhaustive instantiation. All this can be formalised as 
follows. 

Call a rule $l' \to r'$ an instance of a rule $l \to r$, if there is a substitution $\sigma$ with $l' = l\sigma$ and
$r' = r\sigma$. We say that an ATRS $\mathcal{B}$ is an instantiation of $\mathcal{A}$ iff all rules in $\mathcal{B}$ are instances of rules
from $\mathcal{A}$. This instantiation is sufficiently exhaustive if for every derivation

$$\mathsf{main}(d_1, \dots, d_k) \to_{\mathcal{A}} t_1 \to_{\mathcal{A}} t_2 \to_{\mathcal{A}} \cdots$$

where $d_i \in \mathsf{Input}$, there exists a corresponding derivation

$$\mathsf{main}(d_1, \dots, d_k) \to_{\mathcal{B}} t_1 \to_{\mathcal{B}} t_2 \to_{\mathcal{B}} \cdots .$$

The following theorem is immediate from the definition.

Theorem 2 (Instantiation Transformation). Every instantiation transformation, mapping any
ATRS into a sufficiently exhaustive instantiation of it, is complexity reflecting and preserving.

Example 8 (Continued from Example 7). We instantiate the rule $\mathsf{Cl}_1(f, g) @ z \to f @ (g @ z)$
by the two rules

1  $\mathsf{Cl}_1(\mathsf{Cl}_2, \mathsf{Cl}_3(x)) @ z \to \mathsf{Cl}_2 @ (\mathsf{Cl}_3(x) @ z)$

2  $\mathsf{Cl}_1(\mathsf{Cl}_1(f, g), \mathsf{Cl}_3(x)) @ z \to \mathsf{Cl}_1(f, g) @ (\mathsf{Cl}_3(x) @ z)$

leaving all other rules from the ATRS depicted in Example 7 intact. As we reasoned already before,
the instantiation is sufficiently exhaustive: in a reduction of $\mathsf{main}(l)$ for a list $l$, arguments to $\mathsf{Cl}_1$
are always of the form as indicated in the two rules. Note that the right-hand sides of both rules
can be reduced by inlining the calls in the right argument. Overall, we conclude that the runtime
complexity of our running example is reflected in the ATRS consisting of the following six rules:

1  $\mathsf{Cl}_2 @ z \to z$

2  $\mathsf{Cl}_1(\mathsf{Cl}_2, \mathsf{Cl}_3(x)) @ z \to x :: z$

3  $\mathsf{Cl}_1(\mathsf{Cl}_1(f, g), \mathsf{Cl}_3(x)) @ z \to \mathsf{Cl}_1(f, g) @ (x :: z)$

4  $\mathsf{fix}_{\mathsf{walk}} @ [] \to \mathsf{Cl}_2$

5  $\mathsf{fix}_{\mathsf{walk}} @ (x :: ys) \to \mathsf{Cl}_1(\mathsf{fix}_{\mathsf{walk}} @ ys, \mathsf{Cl}_3(x))$

6  $\mathsf{main}(l) \to \mathsf{fix}_{\mathsf{walk}} @ l @ []$
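
Instantiating a rule amounts to applying a substitution to both of its sides. The following Python sketch, added here for illustration, replays the instantiation step of Example 8. It assumes terms encoded as nested tuples with variables as strings; the names `inst` and `instantiate` are our own. Whether a given set of substitutions is sufficiently exhaustive is not checked by this code.

```python
def inst(t, s):
    """Apply the substitution s simultaneously to all variables of t."""
    if isinstance(t, str):
        return s.get(t, t)
    return (t[0],) + tuple(inst(a, s) for a in t[1:])

def instantiate(rule, substitutions):
    """One instance of `rule` per substitution."""
    l, r = rule
    return [(inst(l, s), inst(r, s)) for s in substitutions]

def app(f, a):
    return ("@", f, a)

# Rule (1) of Example 7: Cl1(f, g) @ z -> f @ (g @ z)
cl1_rule = (app(("Cl1", "f", "g"), "z"), app("f", app("g", "z")))
substitutions = [
    {"f": ("Cl2",), "g": ("Cl3", "x")},
    {"f": ("Cl1", "f", "g"), "g": ("Cl3", "x")},
]
instances = instantiate(cl1_rule, substitutions)
for lhs, rhs in instances:
    print(lhs, "->", rhs)
```

Because the substitution is applied simultaneously, the second binding may reuse the names `f` and `g` on its right-hand side, exactly as in the second rule of the example.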


4.4 Uncurrying 

The ATRS from Example 8 is now sufficiently instantiated: for all occurrences of the @ symbol, we
know which function we are applying, even if we do not necessarily know to what we are applying
it. The ATRS is not yet ready to be processed by FOPs, simply because the application function
symbol is still there, and cannot be dealt with.

At this stage, however, the ATRS can indeed be brought to a form suitable for analysis by
FOPs through uncurrying, see e.g. the account of Hirokawa et al. [31]. Uncurrying an ATRS $\mathcal{A}$
involves the definition of a fresh function symbol $f^n$ for each $n$-ary application

$$f(t_1, \dots, t_m) @ t_{m+1} @ \cdots @ t_{m+n} ,$$

encountered in $\mathcal{A}$. This way, applications can be completely eliminated. Although in [31] only
ATRSs defining function symbols of null arity are considered, the extension to our setting poses
no problem. We quickly recap the central definitions.

Define the applicative arity $\mathsf{aa}_{\mathcal{A}}(f)$ of a symbol $f$ in $\mathcal{A}$ as the maximal $n \in \mathbb{N}$ such that a term

$$f(t_1, \dots, t_m) @ t_{m+1} @ \cdots @ t_{m+n} ,$$

occurs in $\mathcal{A}$.

Definition 1. The uncurrying $\lfloor t \rfloor$ of a term $t = f(t_1, \dots, t_m) @ t_{m+1} @ \cdots @ t_{m+n}$, with $n \leq
\mathsf{aa}_{\mathcal{A}}(f)$, is defined as

$$\lfloor t \rfloor := f^n(\lfloor t_1 \rfloor, \dots, \lfloor t_m \rfloor, \lfloor t_{m+1} \rfloor, \dots, \lfloor t_{m+n} \rfloor) ,$$

where $f^0 = f$ and $f^n$ ($1 \leq n \leq \mathsf{aa}_{\mathcal{A}}(f)$) are fresh function symbols. Uncurrying is homomorphically
extended to ATRSs.
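
Definition 1 translates directly into a recursive function. The Python sketch below is our own illustration: terms are nested tuples, variables are strings, the binary symbol `"@"` stands for application, and the fresh symbol $f^n$ is represented by the string `"f^n"`. The helper names `spine` and `uncurry` are hypothetical. A head variable raises an error, mirroring the fact that uncurrying is only defined for head variable free terms.

```python
def spine(t):
    """Split f(t1..tm) @ s1 @ ... @ sn into (f(t1..tm), [s1, ..., sn])."""
    args = []
    while not isinstance(t, str) and t[0] == "@":
        args.append(t[2])
        t = t[1]
    return t, args[::-1]

def uncurry(t):
    """Uncurrying of a term; variables are left untouched."""
    if isinstance(t, str):
        return t
    head, extra = spine(t)
    if isinstance(head, str):
        raise ValueError("head variable: uncurrying is undefined")
    name = head[0] if not extra else f"{head[0]}^{len(extra)}"
    return (name,) + tuple(uncurry(a) for a in head[1:] + tuple(extra))

def app(f, a):
    return ("@", f, a)

NIL = ("[]",)
# Right-hand side of main(l) -> fix_walk @ l @ []
print(uncurry(app(app(("fix_walk",), "l"), NIL)))
# Left-hand side Cl1(Cl1(f, g), Cl3(x)) @ z
print(uncurry(app(("Cl1", ("Cl1", "f", "g"), ("Cl3", "x")), "z")))
```

Applying `uncurry` to both sides of every rule gives the homomorphic extension to rewrite systems mentioned in the definition.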

Note that $\lfloor \mathcal{A} \rfloor$ is well-defined whenever $\mathcal{A}$ is head variable free, i.e., does not contain a term of
the form $x @ t$ for a variable $x$. We intend to use the TRS $\lfloor \mathcal{A} \rfloor$ to simulate reductions of the ATRS
$\mathcal{A}$. In the presence of rules of functional type however, such a simulation fails. To overcome the
issue, we $\eta$-saturate $\mathcal{A}$.

Definition 2. We call a TRS $\mathcal{A}$ $\eta$-saturated if whenever

$$f(l_1, \dots, l_m) @ l_{m+1} @ \cdots @ l_{m+n} \to r \in \mathcal{A} \quad \text{with } n < \mathsf{aa}_{\mathcal{A}}(f),$$

then it contains also a rule

$$f(l_1, \dots, l_m) @ l_{m+1} @ \cdots @ l_{m+n} @ z \to r @ z ,$$

where $z$ is a fresh variable. The $\eta$-saturation $\mathcal{A}_\eta$ of $\mathcal{A}$ is defined as the least extension of $\mathcal{A}$ that
is $\eta$-saturated.

Remark. The $\eta$-saturation $\mathcal{A}_\eta$ of an ATRS $\mathcal{A}$ is not necessarily finite. A simple example is the
one-rule ATRS $\mathsf{f} \to \mathsf{f} @ \mathsf{a}$ where both $\mathsf{f}$ and $\mathsf{a}$ are function symbols. Provided that the ATRS $\mathcal{A}$
is endowed with simple types, and indeed the simple typing of our initial program is preserved
throughout our complete transformation pipeline, the $\eta$-saturation of $\mathcal{A}$ becomes finite.
Example 9 (Continued from Example 8). The ATRS from Example 8 is not $\eta$-saturated: $\mathsf{fix}_{\mathsf{walk}}$
is applied to two arguments in rule (6), but its defining rules, rules (4) and (5), take a single
argument only. The $\eta$-saturation thus contains in addition the following two rules:

1  $\mathsf{fix}_{\mathsf{walk}} @ [] @ z \to \mathsf{Cl}_2 @ z$

2  $\mathsf{fix}_{\mathsf{walk}} @ (x :: ys) @ z \to \mathsf{Cl}_1(\mathsf{fix}_{\mathsf{walk}} @ ys, \mathsf{Cl}_3(x)) @ z$

One can then check that the resulting system is $\eta$-saturated.
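
The saturation step can also be checked mechanically. The following Python sketch is an illustration added here, under assumptions of our own: terms are nested tuples, variables are strings, fresh variables are named `z1`, `z2`, ... (assumed not to occur in the input rules), and the iteration is cut off after `max_rounds` rounds because, as the Remark points out, saturation need not terminate on untyped systems. The names `applicative_arity` and `eta_saturate` are hypothetical.

```python
def spine(t):
    """Split f(t1..tm) @ s1 @ ... @ sn into (f(t1..tm), [s1, ..., sn])."""
    args = []
    while not isinstance(t, str) and t[0] == "@":
        args.append(t[2])
        t = t[1]
    return t, args[::-1]

def subterms(t):
    yield t
    if not isinstance(t, str):
        for a in t[1:]:
            yield from subterms(a)

def applicative_arity(rules):
    """Map each symbol f to the maximal n such that f(..) @ t1 @ ... @ tn occurs."""
    aa = {}
    for rule in rules:
        for side in rule:
            for s in subterms(side):
                head, extra = spine(s)
                if not isinstance(head, str):
                    aa[head[0]] = max(aa.get(head[0], 0), len(extra))
    return aa

def eta_saturate(rules, max_rounds=10):
    """Add rules l @ z -> r @ z until the system is eta-saturated."""
    rules = list(rules)
    for _ in range(max_rounds):
        aa = applicative_arity(rules)
        new = []
        for l, r in rules:
            head, extra = spine(l)
            z = f"z{len(extra)}"  # assumed fresh
            ext = (("@", l, z), ("@", r, z))
            if len(extra) < aa[head[0]] and ext not in rules and ext not in new:
                new.append(ext)
        if not new:
            return rules
        rules += new
    raise RuntimeError("eta-saturation did not stabilise")

def app(f, a):
    return ("@", f, a)

NIL, CL2, FIX = ("[]",), ("Cl2",), ("fix_walk",)
example8 = [
    (app(CL2, "z"), "z"),
    (app(("Cl1", CL2, ("Cl3", "x")), "z"), ("::", "x", "z")),
    (app(("Cl1", ("Cl1", "f", "g"), ("Cl3", "x")), "z"),
     app(("Cl1", "f", "g"), ("::", "x", "z"))),
    (app(FIX, NIL), CL2),
    (app(FIX, ("::", "x", "ys")), ("Cl1", app(FIX, "ys"), ("Cl3", "x"))),
    (("main", "l"), app(app(FIX, "l"), NIL)),
]
saturated = eta_saturate(example8)
print(len(saturated))
```

On the six rules of Example 8 the procedure stabilises after adding the two rules of Example 9, while on the one-rule system from the Remark it hits the cut-off.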

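Lemma 3. Let $\mathcal{A}_\eta$ be the $\eta$-saturation of $\mathcal{A}$.

1. The rewrite relation $\to_{\mathcal{A}_\eta}$ coincides with $\to_{\mathcal{A}}$.

2. Suppose $\mathcal{A}_\eta$ is head variable free. If $s \to_{\mathcal{A}_\eta} t$ then $\lfloor s \rfloor \to_{\lfloor \mathcal{A}_\eta \rfloor} \lfloor t \rfloor$.

Proof. For Property 1, the inclusion ${\to_{\mathcal{A}}} \subseteq {\to_{\mathcal{A}_\eta}}$ follows trivially from the inclusion $\mathcal{A} \subseteq \mathcal{A}_\eta$.
The inverse inclusion ${\to_{\mathcal{A}_\eta}} \subseteq {\to_{\mathcal{A}}}$ can be proven by a standard induction on the derivation of
$l \to r \in \mathcal{A}_\eta$.

Property 2 can be proven by induction on $t$. The proof follows the pattern of the proof
of Sternagel and Thiemann [49]. Notice that in [49, Theorem 10], the rewrite system $\lfloor \mathcal{A}_\eta \rfloor$ is
enriched with uncurrying rules of the form $f^i(x_1, \dots, x_n) @ y \to f^{i+1}(x_1, \dots, x_n, y)$. Such an
extension is not necessary in the absence of head variables. In our setting, the application symbol
is completely eliminated by uncurrying, and thus the above rules are dead code. □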

As a consequence, we immediately obtain the following theorem. 

Theorem 3 (Uncurrying Transformation). Suppose that $\mathcal{A}_\eta$ is head variable free. The uncurrying
transformation, which maps an ATRS $\mathcal{A}$ to the system $\lfloor \mathcal{A}_\eta \rfloor$, is complexity reflecting.

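Example 10 (Continued from Example 9). Uncurrying the $\eta$-saturated ATRS, consisting of the
six rules from Example 8 and the two rules from Example 9, results in the following set of rules:

1  $\mathsf{Cl}^1_1(\mathsf{Cl}_2, \mathsf{Cl}_3(x), z) \to x :: z$

2  $\mathsf{Cl}^1_1(\mathsf{Cl}_1(f, g), \mathsf{Cl}_3(x), z) \to \mathsf{Cl}^1_1(f, g, x :: z)$

3  $\mathsf{Cl}^1_2(z) \to z$

4  $\mathsf{fix}^1_{\mathsf{walk}}([]) \to \mathsf{Cl}_2$

5  $\mathsf{fix}^1_{\mathsf{walk}}(x :: ys) \to \mathsf{Cl}_1(\mathsf{fix}^1_{\mathsf{walk}}(ys), \mathsf{Cl}_3(x))$

6  $\mathsf{fix}^2_{\mathsf{walk}}([], z) \to \mathsf{Cl}^1_2(z)$

7  $\mathsf{fix}^2_{\mathsf{walk}}(x :: ys, z) \to \mathsf{Cl}^1_1(\mathsf{fix}^1_{\mathsf{walk}}(ys), \mathsf{Cl}_3(x), z)$

8  $\mathsf{main}(l) \to \mathsf{fix}^2_{\mathsf{walk}}(l, [])$

Inlining the calls to $\mathsf{fix}^2_{\mathsf{walk}}$ and $\mathsf{Cl}^1_2(z)$, followed by dead code elimination, results finally in the
TRS $\mathcal{R}_{\mathsf{rev}}$ from Section 2.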

5 Automation 

In the last section we have laid the formal foundation of our program transformation methodology,
and ultimately of our tool HOCA. Up to now, however, program transformations (except for
uncurrying) are too abstract to be turned into actual algorithms. In dead code elimination, for 
instance, the underlying computation problem (namely the one of precisely isolating usable rules) 
is undecidable. In inlining, one has a decidable transformation, which however results in a blowup 
of program sizes, if blindly applied. 
This section is devoted to describing some concrete design choices we made when automating
our program transformations. Another, related, issue we will talk about is the effective combination 
of these techniques, the transformation pipeline. 

5.1 Automating Inlining 

The main complication that arises while automating our inlining transformation is to decide where 
the transformation should be applied. Here, there are two major points to consider: first, we want 
to ensure that the overall transformation is not only complexity reflecting, but also complexity 
preserving, thus not defeating its purpose. To address this issue, we employ inlining conservatively, 
ensuring that inlining does not duplicate function calls. Secondly, as we already hinted after 
Example 6, exhaustive inlining is usually not desirable and may even lead to non-termination in 
the transformation pipeline described below. Instead, we want to ensure that inlining simplifies 
the problem with respect to some sensible metric, and plays well in conjunction with the other 
transformation techniques. 

Instead of working with a closed inlining strategy, our implementation inline(P) is parameterised
by a predicate $P$ which, intuitively, tells when inlining a call at position $p$ in a rule $l \to r$
is sensible at the current stage in our transformation pipeline. The algorithm inline(P) replaces
every rule $l \to r$ by $\mathsf{inline}_{\mathcal{A},p}(l \to r)$ for some position $p$ such that $P(p, l \to r)$ holds. The following
four predicates turned out to be useful in our transformation pipeline. The first two are designed
by taking into account the specific shape of ATRSs obtained by defunctionalisation, the last two
are generic.

• match: This predicate holds if the right-hand side $r$ is labeled by a symbol of the form $\mathsf{match}_{cs}$
at position $p$. That is, the predicate enables inlining of calls resulting from the translation of a
match-expression, thereby eliminating one indirection due to the encoding of pattern matching
during defunctionalization.

• lambda-rewrite: This predicate holds if the subterm $r|_p$ is of the form $\mathsf{lam}_{x.e}(\vec{t}) @ s$. Note
that by definition it is enforced that inlining corresponds to a plain rewrite: head variables
are not instantiated. For instance, inline(lambda-rewrite) is inapplicable on the rule
$\mathsf{Cl}_1(f, g) @ z \to f @ (g @ z)$. This way, we avoid that the variables $f$ and $g$ are improperly
instantiated.

• constructor: The predicate holds if the right-hand sides of all rules used to inline r|p are 
constructor terms, i.e., do not give rise to further function calls. Overall, the number of 
function calls therefore decreases. As a side effect, more patterns become obvious in rules, 
which facilitates further inlining. 

• decreasing: The predicate holds if any of the following two conditions is satisfied: (i) proper
inlining: the subterm $r|_p$ constitutes the only call-site to the inlined function $f$. This way, all
rules defining $f$ in $\mathcal{A}$ will turn to dead code after inlining; (ii) size decreasing: each right-hand
side in $\mathsf{inline}_{\mathcal{A},p}(l \to r)$ is strictly smaller in size than the right-hand side $r$.

Here, the aim is to facilitate FOPs on the generated output. In the first case, the number of
rules decreases, which usually implies that in the analysis, a FOP generates fewer constraints
which have to be solved. In the second case, the number of constraints might increase, but
the individual constraints are usually easier to solve, due to the decrease in sizes of right-hand
sides.

We emphasise that all inlinings performed on our running example Arev are captured by the 
instances of inlining just defined. 
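
Two of the generic predicates are simple enough to sketch directly. The Python code below is an illustration added here and makes no claim about how HOCA implements them: terms are nested tuples with variables as strings, the term size counts symbols and variables, and the function names `constructor` and `size_decreasing` as well as their argument conventions are our own. The set `defined` of defined symbols is passed explicitly.

```python
def size(t):
    """Number of symbol and variable occurrences in a term."""
    return 1 if isinstance(t, str) else 1 + sum(size(a) for a in t[1:])

def symbols(t):
    if isinstance(t, str):
        return set()
    return {t[0]}.union(*(symbols(a) for a in t[1:]))

def constructor(inlining_rules, defined):
    """`constructor`: every rule used for inlining has a constructor term as right-hand side."""
    return all(not (symbols(v) & defined) for _, v in inlining_rules)

def size_decreasing(rule, inlined_rules):
    """Condition (ii) of `decreasing`: each new right-hand side is strictly smaller."""
    return all(size(r2) < size(rule[1]) for _, r2 in inlined_rules)

def app(f, a):
    return ("@", f, a)

defined = {"@", "main"}
CL2 = ("Cl2",)
lhs = app(("Cl1", CL2, ("Cl3", "x")), "z")
# Rule before inlining: Cl1(Cl2, Cl3(x)) @ z -> Cl2 @ (Cl3(x) @ z)
before = (lhs, app(CL2, app(("Cl3", "x"), "z")))
# Rule used for inlining the inner call: Cl3(x) @ z -> x :: z
cl3_rule = (app(("Cl3", "x"), "z"), ("::", "x", "z"))
# Rule after inlining: Cl1(Cl2, Cl3(x)) @ z -> Cl2 @ (x :: z)
after = (lhs, app(CL2, ("::", "x", "z")))

print(constructor([cl3_rule], defined), size_decreasing(before, [after]))
```

Both predicates hold for this inlining step from Example 8, whereas `constructor` rejects inlining with the composition rule $\mathsf{Cl}_1(f, g) @ z \to f @ (g @ z)$, whose right-hand side still contains applications.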

5.2 Automating Instantiation and Dead Code Elimination via Control 
Flow Analysis 

One way to effectively eliminate dead code and apply instantiation, as in Examples 7 and 8, 
consists in inferring the shape of closures passed during reductions. This way, we can on the one 
hand specialise rewrite rules being sure that the obtained instantiation is sufficiently exhaustive, 
and on the other hand discover that certain rules are simply useless, and can thus be eliminated. 
Figure 3: Over-approximation of the collecting semantics of the ATRS from Example 6.


To this end, we rely on an approximation of the collecting semantics. In static analysis, the 
collecting semantics of a program maps a given program point to the collection of states attainable 
when control reaches that point during execution. In the context of rewrite systems, it is natural to 
define the rewrite rules as program points, and substitutions as states. Throughout the following, 
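we fix an ATRS $\mathcal{A} = \{ l_i \to r_i \}_{i \in \{1, \dots, n\}}$. We define the collecting semantics of $\mathcal{A}$ as a tuple
$(Z_1, \dots, Z_n)$, where

$$Z_i := \{ (\sigma, t) \mid \exists \vec{d} \in \mathsf{Input},\ \mathsf{main}(\vec{d}) \to^{*}_{\mathcal{A}} C[l_i\sigma] \to_{\mathcal{A}} C[r_i\sigma] \text{ and } r_i\sigma \to^{*}_{\mathcal{A}} t \} .$$

Here the substitutions $\sigma$ are restricted to the set $\mathsf{Var}(l_i)$ of variables occurring in the left-hand
side $l_i$.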

The collecting semantics of A includes all the necessary information for implementing both 
dead code elimination and instantiation: 

Lemma 4. The following properties hold: 

1. The rule $l_i \to r_i \in \mathcal{A}$ constitutes dead code if and only if $Z_i = \emptyset$.

2. Suppose the ATRS $\mathcal{B}$ is obtained by instantiating rules $l_i \to r_i$ with substitutions $\sigma_{i,1}, \dots, \sigma_{i,k_i}$.
Then the instantiation is sufficiently exhaustive if for every substitution $\sigma$ with $(\sigma, t) \in Z_i$,
there exists a substitution $\sigma_{i,j}$ ($j \in \{1, \dots, k_i\}$) which is at least as general as $\sigma$.

Proof. The first property follows by definition. For the second property, consider a derivation

$$\mathsf{main}(d_1, \dots, d_k) \to^{*}_{\mathcal{A}} C[l_i\sigma] \to_{\mathcal{A}} C[r_i\sigma] ,$$

and thus $(\sigma, r_i\sigma) \in Z_i$. By assumption, there exists a substitution $\sigma_{i,j}$ ($j \in \{1, \dots, k_i\}$) which is at least
as general as $\sigma$. Hence the ATRS $\mathcal{B}$ can simulate the step from $C[l_i\sigma]$ to $C[r_i\sigma]$, using the rule
$l_i\sigma_{i,j} \to r_i\sigma_{i,j} \in \mathcal{B}$. From this, the property follows by inductive reasoning. □

As a consequence, the collecting semantics of A is itself not computable. Various techniques to 
over-approximate the collecting semantics have been proposed, e.g. by Feuillade et al. [22], Jones 
[32] and Kochems and Ong [35]. In all the works above, the approximation consists in describing 
the tuple $(Z_1, \dots, Z_n)$ by a finite object.

In HOCA we have implemented a variation of the technique of Jones [32], tailored to call-by-value 
semantics (already hinted at in [32]). Conceptually, the form of control flow analysis we perform is 
close to a 0-CFA [39], merging information derived from different call sites. Whilst being efficient 
to compute, the precision of this relatively simple approximation turned out to be reasonable for
our purpose. 

The underlying idea is to construct a (regular) tree grammar which over-approximates the 
collecting semantics. Here, a tree grammar $\mathcal{G}$ can be seen as a ground ATRS whose left-hand sides
are all function symbols with arity zero. The non-terminals of $\mathcal{G}$ are precisely the left-hand sides.
For the remainder, we assume that variables occurring in $\mathcal{A}$ are indexed by indices of rules, i.e., every
variable occurring in the rule $l_i \to r_i \in \mathcal{A}$ has index $i$. Hence the sets of variables of rewrite rules
in $\mathcal{A}$ are pairwise disjoint. Besides a designated non-terminal $S$, the start-symbol, the constructed
tree grammar $\mathcal{G}$ admits two kinds of non-terminals: non-terminals $R_i$ for each rule $l_i \to r_i$ and
non-terminals $z_i$ for variables $z_i$ occurring in $\mathcal{A}$. Note that the variable $z_i$ is considered
as a constant in $\mathcal{G}$. We say that $\mathcal{G}$ is safe for $\mathcal{A}$ if the following two conditions are satisfied for all
$(\sigma, t) \in Z_i$: (i) $z_i \to^{*}_{\mathcal{G}} \sigma(z_i)$ for each $z_i \in \mathsf{Var}(l_i)$; and (ii) $R_i \to^{*}_{\mathcal{G}} t$. This way, $\mathcal{G}$ constitutes a finite
over-approximation of the collecting semantics of $\mathcal{A}$.

Example 11. Figure 3 shows the tree grammar $\mathcal{G}$ constructed by the method described below,
which is safe for the ATRS $\mathcal{A}$ from Example 6. The notation $N \to t_1 \mid \cdots \mid t_n$ is short-hand for
the $n$ rules $N \to t_i$.

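The construction of Jones consists of an initial automaton $\mathcal{A}_0$, which describes considered start
terms, and which is then systematically closed under rewriting by way of an extension operator
$\delta(\cdot)$. Suitable to our concerns, we define $\mathcal{G}_0$ as the tree grammar consisting of the following rules:

$S \to \mathsf{main}(*, \dots, *)$ and

$* \to C_j(*, \dots, *)$ for each constructor $C_j$ of $\mathcal{A}$.

Then clearly $S \to^{*}_{\mathcal{G}} \mathsf{main}(d_1, \dots, d_n)$ for all inputs $d_i \in \mathsf{Input}$. We let $\mathcal{G}$ be the least set of rules
satisfying $\mathcal{G} \supseteq \mathcal{G}_0 \cup \delta(\mathcal{G})$ with

$$\delta(\mathcal{G}) := \bigcup_{N \to C[u] \in \mathcal{G}} \mathsf{Ext}^{\mathsf{cbv}}(N \to C[u]) .$$

Here, $\mathsf{Ext}^{\mathsf{cbv}}(N \to C[u])$ is defined as the following set of rules:

$$\left\{ \begin{array}{l} N \to C[R_i], \\ R_i \to r_i, \text{ and} \\ z_i \to \sigma(z_i) \text{ for all } z_i \in \mathsf{Var}(l_i) \end{array} \;\middle|\; \begin{array}{l} l_i \to r_i \in \mathcal{A}, \\ u \to^{*}_{\mathcal{G}} l_i\sigma \text{ is minimal} \\ \text{and } \sigma \text{ normalised} \end{array} \right\} .$$

In contrast to [32], we require that the substitution $\sigma$ is normalised, thereby modelling call-by-value
semantics. The tree grammar $\mathcal{G}$ is computable using a simple fix-point construction. Minimality of
$f(t_1, \dots, t_k) \to^{*}_{\mathcal{G}} l_i\sigma$ means that there is no shorter sequence $f(t_1, \dots, t_k) \to^{*}_{\mathcal{G}} l_i\tau$ with $l_i\tau \to^{*}_{\mathcal{G}} l_i\sigma$,
and ensures that $\mathcal{G}$ is finite [32], thus the construction is always terminating.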

We illustrate the construction on the ATRS from Example 6. 

Example 12. Revisit the ATRS from Example 6. To construct the safe tree grammar as explained
above, we start from the initial grammar $\mathcal{G}_0$ given by the rules

$$S \to \mathsf{main}(*) \qquad * \to [] \mid * :: * ,$$

and then successively fix violations of the above closure condition. The only violation in the
initial grammar is caused by the first production. Here, the right-hand side $\mathsf{main}(*)$ matches the
(renamed) rule 12: $\mathsf{main}(l_{12}) \to \mathsf{fix}_{\mathsf{walk}} @ l_{12} @ []$, using the substitution $\{l_{12} \mapsto *\}$. We fix the
violation by adding productions

$$S \to R_{12} \qquad R_{12} \to \mathsf{fix}_{\mathsf{walk}} @ l_{12} @ [] \qquad l_{12} \to * .$$

The tree grammar $\mathcal{G}$ constructed so far tells us that $l_{12}$ is a list. In particular, we have the following
two minimal sequences which make the left subterm of the $R_{12}$-production an instance of the
left-hand sides of defining rules of $\mathsf{fix}_{\mathsf{walk}}$ (rules (9) and (10)):

$$\mathsf{fix}_{\mathsf{walk}} @ l_{12} \to^{+}_{\mathcal{G}} \mathsf{fix}_{\mathsf{walk}} @ [] \qquad \mathsf{fix}_{\mathsf{walk}} @ l_{12} \to^{+}_{\mathcal{G}} \mathsf{fix}_{\mathsf{walk}} @ (* :: *) .$$

To resolve the closure violation, the tree grammar is extended by productions

$$R_{12} \to R_9 @ [] \qquad R_9 \to \mathsf{Cl}_2$$

because of rule (9), and by

$$R_{12} \to R_{10} @ [] \qquad x_{10} \to *$$

$$R_{10} \to \mathsf{Cl}_1(\mathsf{fix}_{\mathsf{walk}} @ ys_{10}, \mathsf{Cl}_3(x_{10})) \qquad ys_{10} \to *$$

due to rule (10). We can now identify a new violation in the production of $R_{10}$. Fixing all
violations this way will finally result in the tree grammar depicted in Figure 3.
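
The starting point of the construction, the initial grammar $\mathcal{G}_0$, is easy to experiment with. The Python sketch below is our own illustration and covers only membership in the language of a tree grammar whose productions have the shape $N \to f(N_1, \dots, N_k)$ with non-terminals as arguments. It does not implement the closure under rewriting. The dictionary encoding and the name `derives` are assumptions made for this example.

```python
def derives(grammar, nt, term):
    """Does the non-terminal `nt` derive the ground term `term`?

    `grammar` maps a non-terminal to a list of productions (symbol, child_nt, ...).
    """
    return any(
        sym == term[0]
        and len(kids) == len(term) - 1
        and all(derives(grammar, k, a) for k, a in zip(kids, term[1:]))
        for sym, *kids in grammar.get(nt, [])
    )

# Initial grammar of Example 12:  S -> main(*)   * -> [] | * :: *
g0 = {
    "S": [("main", "*")],
    "*": [("[]",), ("::", "*", "*")],
}

NIL = ("[]",)
start_term = ("main", ("::", NIL, ("::", NIL, NIL)))
print(derives(g0, "S", start_term))
```

Every start term built from the list constructors is accepted from `S`, while terms containing closures such as $\mathsf{Cl}_2$ only become derivable once the grammar has been extended as described in the example.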

The following lemma confirms that Q is closed under rewriting with respect to the call-by-value 
semantics. The lemma constitutes a variation of Lemma 5.3 from [32]. 

Lemma 5. If S —¥g t and t — C[lia] — C[ria] then S — C[Ri ], Ri Vi and Zi — a{zi) 
for all variables Zi G Var(/i). 

Theorem 4. The tree grammar Q is safe for A. 

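Proof. Fix $(\sigma, t) \in Z_i$, and let $z \in \mathsf{Var}(l_i)$. Thus
$\mathsf{main}(d) \to^*_{\mathcal{A}} C[l_i\sigma] \to_{\mathcal{A}} C[r_i\sigma]$ and $r_i\sigma \to^*_{\mathcal{A}} t$
for some inputs $d \in \mathsf{Input}$. As we have $S \to^*_{\mathcal{G}} \mathsf{main}(d)$ since
$\mathcal{G}_0 \subseteq \mathcal{G}$, Lemma 5 yields $R_i \to_{\mathcal{G}} r_i$
and $z_i \to^*_{\mathcal{G}} \sigma(z)$, i.e., the second safeness condition is satisfied. Clearly,
$R_i \to_{\mathcal{G}} r_i \to^*_{\mathcal{G}} r_i\sigma$. A
standard induction on the length of $r_i\sigma \to^*_{\mathcal{A}} t$ then yields $R_i \to^*_{\mathcal{G}} t$, using again Lemma 5. □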

We arrive now at our concrete implementation $\mathsf{cfa}(\mathcal{A})$ that employs the control flow
analysis outlined above to deal with both dead code elimination and instantiation on the given ATRS $\mathcal{A}$.
The construction of the tree grammar $\mathcal{G}$ itself closely follows the algorithm outlined by Jones
[32]. Recall that the $i$-th rule $l_i \to r_i \in \mathcal{A}$ constitutes dead code if the $i$-th component $Z_i$ of the
collecting semantics of $\mathcal{A}$ is empty, by Lemma 4(1). Based on the constructed tree grammar, the
implementation identifies rule $l_i \to r_i$ as dead code when $\mathcal{G}$ does not define a production $R_i \to t$
and thus $Z_i = \emptyset$. All such rules are eliminated, in accordance with Proposition 4. On the remaining
rules, our implementation performs instantiation as follows. We suppose $\epsilon$-productions $N \to M$,
for non-terminals $M$, have been eliminated by way of a standard construction, preserving the set
of terms from non-terminals in $\mathcal{G}$. Thus productions in $\mathcal{G}$ have the form $N \to f(t_1,\dots,t_k)$. Fix a
rule $l_i \to r_i \in \mathcal{A}$. The primary goal of this stage is to get rid of head variables, with respect to
the $\eta$-saturated ATRS $\mathcal{A}_\eta$, thereby enabling uncurrying so that the ATRS $\mathcal{A}$ can be brought into
functional form. For all such head variables $z$, then, we construct a set of binders

$$\{ z_i \mapsto \mathsf{fresh}(f(t_1,\dots,t_k)) \mid z_i \to f(t_1,\dots,t_k) \in \mathcal{G} \} ,$$

where the function $\mathsf{fresh}$ replaces non-terminals by fresh variables, discarding binders whose
right-hand side contains defined symbols. For variables $z$ which do not occur in head positions, we
construct such a binder only if the production $z_i \to f(t_1,\dots,t_k)$ is unique. With respect to the
tree grammar of Figure 3, for the head variables $f$, $g$ of rule 1 the implementation generates the binders

$$\{ f_1 \mapsto \mathsf{Cl}_2,\; f_1 \mapsto \mathsf{Cl}_1(f', \mathsf{Cl}_3(x')) \} \quad\text{and}\quad \{ g_1 \mapsto \mathsf{Cl}_3(x'') \} .$$

The product-combination of all such binders then gives a set of substitutions
$\{\sigma_{i,1},\dots,\sigma_{i,i_k}\}$ that
leads to sufficiently many instantiations $l_i\sigma_{i,j} \to r_i\sigma_{i,j}$ of rule $l_i \to r_i$, by Lemma 4(2). Our
implementation replaces every rule $l_i \to r_i \in \mathcal{A}$ by instantiations constructed this way.
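The product-combination of binders can be sketched in a few lines of Python. The helper names `substitutions` and `instantiate`, the tuple encoding of terms, and the shape assumed for rule (1), namely Cl1(f, g) @ z -> f @ (g @ z), are illustrative assumptions and not taken from the HOCA sources.

```python
from itertools import product


def substitutions(binders):
    """Combine per-variable binder sets into all substitutions.

    `binders` maps each variable to the list of terms it may be bound to.
    The result holds one substitution (a dict) per way of choosing exactly
    one binder for every variable.
    """
    variables = sorted(binders)
    choices = [binders[v] for v in variables]
    return [dict(zip(variables, combo)) for combo in product(*choices)]


def instantiate(term, sigma):
    """Apply `sigma` once to a term; variables are bare strings,
    applications are tuples whose first entry is the function symbol."""
    if isinstance(term, str):
        return sigma.get(term, term)
    return (term[0],) + tuple(instantiate(t, sigma) for t in term[1:])


# Binders for the head variables f and g, following the example above.
binders = {
    "f": [("Cl2",), ("Cl1", "f'", ("Cl3", "x'"))],
    "g": [("Cl3", "x''")],
}

# Assumed shape of rule (1): Cl1(f, g) @ z -> f @ (g @ z).
lhs = ("@", ("Cl1", "f", "g"), "z")
rhs = ("@", "f", ("@", "g", "z"))

instances = [
    (instantiate(lhs, s), instantiate(rhs, s)) for s in substitutions(binders)
]
```

Here two binders for f and one for g yield two instances of the rule, and in both of them the head positions of the right-hand side are occupied by closure constructors rather than variables.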
    simplify = simpATRS; toTRS; simpTRS where

      simpATRS = exhaustive inline(lambda-rewrite);
                 exhaustive inline(match);
                 exhaustive inline(constructor);
                 usableRules

      toTRS    = cfa; uncurry; usableRules

      simpTRS  = exhaustive ((inline(decreasing); usableRules) <> cfaDCE)

Figure 4: Transformation Strategy in HOCA.


The definition of binder was chosen to keep the number of computed substitutions minimal,
and hence the generated head-variable-free ATRS small. Putting things together, we see that the
instantiation is sufficiently exhaustive, and thus the overall transformation is complexity reflecting
and preserving by Theorem 2. By cfaDCE we denote the variation of cfa that performs dead code
elimination, but no instantiations.

5.3 Combining Transformations 

We have now seen all the building blocks underlying our tool HOCA. But in which order should 
we apply the introduced program transformations? In principle, one could try to blindly iterate 
the proposed techniques and hope that a FOP can cope with the output. Since transformations 
are closed under composition, the blind iteration of transformations is sound, although seldom 
effective. In short, a strategy is required that combines the proposed techniques in a sensible 
way. There is no clear notion of a perfect strategy. After all, we are interested in non-trivial 
program properties. However, it is clear that any sensible strategy should at least (i) yield overall 
a transformation that is effectively computable, (ii) not defeat its purpose by generating TRSs 
whose runtime complexity is not at all in relation to the complexity of the analysed program, and 
(iii) produce ATRSs that FOPs are able to analyse. 

In Figure 4 we render the current transformation strategy underlying our tool HOCA. More
precisely, Figure 4 defines a transformation simplify based on the following transformation
combinators:

• $f_1; f_2$ denotes the composition $f_2 \circ f_1$, where $f_1$ is extended to a total function that
yields $f_1(\mathcal{A})$ if defined and $\mathcal{A}$ otherwise;

• the transformation exhaustive $f$ iterates the transformation $f$ until inapplicable on the current
problem; and

• the operator <> implements left-biased choice: $f_1$ <> $f_2$ applies transformation $f_1$ if successful,
otherwise $f_2$ is applied.

It is easy to see that all three combinators preserve the two crucial properties of transformations,
viz. complexity reflection and complexity preservation.
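The three combinators can be sketched in Python as follows. This is a minimal model under stated assumptions: a transformation is a function that returns `None` when it is inapplicable, `seq(f1, f2)` is taken to be inapplicable exactly when `f2` is, and `exhaustive` returns its input unchanged if the transformation never applies. The toy transformations `halve` and `drop_one` on integers are hypothetical stand-ins for the transformations of Figure 4.

```python
def seq(f1, f2):
    """f1; f2: run f1 (keeping the problem if f1 is inapplicable), then f2."""
    def run(problem):
        step = f1(problem)
        return f2(problem if step is None else step)
    return run


def exhaustive(f):
    """Iterate f until it becomes inapplicable on the current problem."""
    def run(problem):
        while True:
            step = f(problem)
            if step is None:
                return problem
            problem = step
    return run


def choice(f1, f2):
    """f1 <> f2: left-biased choice, f2 is tried only if f1 is inapplicable."""
    def run(problem):
        step = f1(problem)
        return f2(problem) if step is None else step
    return run


# Toy transformations on integers, for demonstration only.
def halve(n):
    """Applicable to positive even numbers."""
    return n // 2 if n > 0 and n % 2 == 0 else None


def drop_one(n):
    """Applicable to odd numbers greater than one."""
    return n - 1 if n > 1 and n % 2 == 1 else None


strategy = exhaustive(choice(halve, drop_one))
```

Note that `exhaustive` loops forever if its argument stays applicable, so termination of a concrete strategy has to be argued separately for the transformations it iterates.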

The transformation simplify depicted in Figure 4 is composed out of three transformations
simpATRS, toTRS and simpTRS, each itself defined from transformations inline(P) and cfa
described in Sections 5.1 and 5.2, respectively, the transformation usableRules which implements
the aforementioned computationally cheap, unification-based criterion from [23] to eliminate
dead code (see Section 4.2), and the transformation uncurry, which implements the uncurrying
transformation from Section 4.4.

The first transformation in our chain, simpATRS, performs inlining driven by the specific shape
of the input ATRS obtained by defunctionalisation, followed by syntax-driven dead code
elimination. The transformation toTRS will then translate the intermediate ATRSs to functional
form by the uncurrying transformation, using control flow analysis to instantiate head variables
sufficiently and further eliminate dead code. The transformation simpTRS then simplifies the
obtained TRS by controlled inlining, applying syntax-driven dead code elimination where possible,
resorting to the more expensive version based on control flow analysis in case the simplification
stalls. To understand the sequencing of transformations in simpTRS, observe that the strategy
inline(decreasing) is interleaved with dead code elimination. Dead code elimination, both in
the form of usableRules and cfaDCE, potentially restricts the set $\mathsf{inline}_{\mathcal{A},P}(l \to r)$, and might
in consequence facilitate the transformation inline(decreasing). Importantly, the rather expensive,
flow-analysis-driven dead code analysis is only performed in case both inline(decreasing) and
its cheaper cousin usableRules fail.

Table 1: Experimental Evaluation conducted with TcT and TTT2. Execution times are given as
minimum / average / maximum.

                          constant              linear                quadratic             polynomial            terminating
D  # systems              2                     5                     5                     5                     8
   FOP execution time     0.37 / 1.71 / 3.05    0.37 / 4.82 / 13.85   0.37 / 4.82 / 13.85   0.37 / 4.82 / 13.85   0.83 / 1.38 / 1.87
S  # systems              2                     14                    18                    20                    25
   HOCA execution time    0.01 / 2.28 / 4.56    0.01 / 0.54 / 4.56    0.01 / 0.43 / 4.56    0.01 / 0.42 / 4.56    0.01 / 0.87 / 6.48
   FOP execution time     0.23 / 0.51 / 0.79    0.23 / 2.53 / 14.00   0.23 / 6.30 / 30.12   0.23 / 10.94 / 60.10  0.72 / 1.43 / 3.43

To see termination, it suffices to realize that all exhaustive applications of transformations in 
simplify are terminating: 

• For inline(match) this claim is immediate by the shape of input ATRSs. Each application
of inline(match) removes one occurrence of a closure-constructor obtained from the
transformation of a match-expression in right-hand sides.

• Similarly, exhaustive application of inline(constructor) is terminating, since at each step the
number of defined symbols in right-hand sides is reduced.

• For iterated application of inline(lambda-rewrite) the claim is less obvious. Intuitively,
termination holds because the rewritings performed on right-hand sides correspond to steps
with respect to a very restricted fragment of PCF, which is itself terminating: the simply typed
λ-calculus. Note that the restriction to rewrites is essential: as soon as we allow inlining by
narrowing, termination is not guaranteed.

• Concerning the final case, by way of contradiction suppose that

(inline(decreasing); usableRules) <> cfaDCE

is applied infinitely often. Dead code elimination cannot be the culprit; indeed, inline(decreasing)
must then be applied infinitely often. In such a sequence, the case of proper inlining underlying
the definition of the predicate decreasing cannot hold infinitely often, as the number of
defined symbols in right-hand sides decreases after each application. Hence ultimately, an infinite
application of inline(decreasing) has to happen due to the size-decreasing condition. But
in such a sequence, the multiset of sizes of right-hand sides is decreasing with respect to the
multiset extension of the strict order > on naturals, which itself is well-founded. Contradiction!

Although we cannot give precise bounds on the runtime complexity in general, in practice the
number of applications of inlinings is sufficiently controlled to be of practical relevance.
Importantly, the way inlining and instantiation are employed ensures that the sizes of all intermediate
TRSs are kept under tight control.
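The multiset extension used in the last case of the termination argument can be made concrete with a small Python sketch. It implements the standard Dershowitz-Manna characterisation of the multiset extension of > on natural numbers; the function name `multiset_greater` and the size lists in the usage note are hypothetical and chosen only for illustration.

```python
from collections import Counter


def multiset_greater(m, n):
    """Multiset extension of > on naturals (Dershowitz-Manna).

    m is greater than n if the two multisets differ and every element
    that n has in excess is dominated by some strictly larger element
    that m has in excess.
    """
    m, n = Counter(m), Counter(n)
    if m == n:
        return False
    only_in_m = m - n  # elements occurring more often in m than in n
    only_in_n = n - m  # elements occurring more often in n than in m
    return all(any(x > y for x in only_in_m) for y in only_in_n)
```

For instance, replacing a right-hand side of size 7 by two right-hand sides of sizes 6 and 5 turns the multiset [7, 3] into [6, 5, 3], and `multiset_greater([7, 3], [6, 5, 3])` holds. Since the order is well-founded, such decreasing steps cannot be repeated forever.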

6 Experimental Evaluation 

So far, we have covered the theoretical and implementation aspects underlying our tool HOCA.
The purpose of this section is to indicate how our methods perform in practice. To this end, we
compiled a diverse collection of higher-order programs from the literature [21, 34, 41] and standard
textbooks [14, 45], on which we performed tests with our tool in conjunction with the
general-purpose first-order complexity tool TcT [8], version 2.1.⁶ For comparison, we have also paired HOCA
with the termination tool TTT2 [36], version 1.15.

In Table 1 we summarise our experimental findings on the 25 examples from our collection.⁷
Row S in the table indicates the total number of higher-order programs whose runtime could be
classified linear, quadratic and at most polynomial when HOCA is paired with the back-end TcT, and
those programs that can be shown terminating when HOCA is paired with TTT2. In contrast, row D
shows the same statistics when the FOP is run directly on the defunctionalised program, given by
Proposition 2. For each of those results, we state the minimum, average and maximum execution
time of HOCA and the employed FOP. All experiments were conducted on a machine with 8
dual-core AMD Opteron 885 processors running at 2.60GHz, and 64GB of RAM.⁸ Furthermore, the
tools were advised to search for a certificate within 60 seconds.

As the table indicates, not all examples in the testbed are amenable to a runtime complexity
analysis through the approach proposed here. However, at least termination can be automatically
verified. For all but one example (namely mapplus.fp) the obtained complexity certificate is
asymptotically optimal. As far as we know, no other fully automatic complexity tool can handle
the five open examples. We will comment below on the reason why HOCA may fail.

Let us now analyse some of the programs from our testbed. For each program, we will briefly
discuss what HOCA, followed by selected FOPs, can prove about it. This will give us the opportunity
to discuss specific aspects of our methodology, but also limitations of the current
FOPs.

Reversing a List. Our running example, namely the functional program from Section 2 which 
reverses a list, can be transformed by HOCA into an ATRS which can easily be proved to have 
linear complexity. Similar results can be proved for other programs. 
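The shape of the defunctionalised reverse program can be mirrored in Python, with closures represented as tagged tuples. The rules for fix_walk follow rules (9) and (10) as used in Example 12; the application rules for Cl1, Cl2 and Cl3 are an assumed reading of the closure constructors (composition, identity and cons, respectively), and all function names are illustrative.

```python
def fix_walk(xs):
    """Rules (9) and (10): build the closure that prepends reversed xs."""
    if not xs:
        return ("Cl2",)                                 # fix_walk @ []
    return ("Cl1", fix_walk(xs[1:]), ("Cl3", xs[0]))    # fix_walk @ (x :: ys)


def apply_closure(closure, z):
    """The application symbol @ restricted to closure constructors."""
    tag = closure[0]
    if tag == "Cl1":    # assumed: Cl1(f, g) @ z -> f @ (g @ z)
        _, f, g = closure
        return apply_closure(f, apply_closure(g, z))
    if tag == "Cl2":    # assumed: Cl2 @ z -> z
        return z
    if tag == "Cl3":    # assumed: Cl3(x) @ z -> x :: z
        return [closure[1]] + z
    raise ValueError(f"unknown closure constructor {tag!r}")


def rev(xs):
    """main: apply the closure built by fix_walk to the empty list."""
    return apply_closure(fix_walk(xs), [])
```

Each list element gives rise to one Cl1 and one Cl3 closure, each of which is applied exactly once, which matches the linear number of rewrite steps. The Python list copies made by slicing and concatenation are an artefact of this sketch and are not counted by the rewriting cost model.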

Parametric Insertion Sort. A more complicated example is a higher-order formulation of the
insertion sort algorithm, example isort-fold.fp, which is parametric in the subroutine which
compares the elements of the list being sorted. This is an example which cannot be handled by
linear type systems [12]: we do recursion over a function in which a higher-order variable occurs
free. Also, type systems like the ones in [34], which are restricted to linear complexity certificates,
cannot bound the runtime complexity of this program. HOCA, instead, is able to put it in a form
which allows TcT to conclude that the complexity is indeed quadratic.
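For intuition, a parametric insertion sort written as a fold can be sketched in Python as follows. This is a hypothetical reconstruction in the spirit of isort-fold.fp, not the benchmark program itself; the names `insert`, `isort` and `count_comparisons` are assumptions. The counter makes the quadratic number of calls to the comparison subroutine visible.

```python
from functools import reduce


def insert(leq, x, sorted_xs):
    """Insert x into an already sorted list, using the comparison leq."""
    for i, y in enumerate(sorted_xs):
        if leq(x, y):
            return sorted_xs[:i] + [x] + sorted_xs[i:]
    return sorted_xs + [x]


def isort(leq, xs):
    """Insertion sort as a fold over xs, parametric in leq."""
    return reduce(lambda acc, x: insert(leq, x, acc), xs, [])


def count_comparisons(xs):
    """Sort xs with <= and report how many comparisons were made."""
    calls = 0

    def leq(a, b):
        nonlocal calls
        calls += 1
        return a <= b

    return isort(leq, xs), calls
```

On an ascending input of length n every insertion scans the whole accumulator, so the comparison is called n(n-1)/2 times, for example 10 times for a list of length 5.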

Divide and Conquer Combinators. Another noticeable example is the divide and conquer
combinator, defined in example mergesort-dc.fp, which we have taken from [45]. We have then
instantiated it so that the resulting algorithm is the merge sort algorithm. HOCA is indeed able to
translate the program into a first-order program which can then be proved to be terminating by
FOPs. This already tells us that the obtained ATRS is in a form suitable for the analysis. The
fact that FOPs cannot say anything about its complexity is due to the limitations of current FOPs
which, indeed, are not able to perform any non-local size analysis, itself a necessary condition for
proving merge sort to be a polynomial time algorithm. Similar considerations hold for Okasaki’s
parser combinator, various instances of which can be proved themselves terminating.
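A divide and conquer combinator of the kind discussed here can be sketched in Python as follows. The parameter names (`is_trivial`, `solve`, `divide`, `combine`) and the exact interface are assumptions for illustration; the combinator in mergesort-dc.fp may be shaped differently.

```python
def divide_and_conquer(is_trivial, solve, divide, combine):
    """Generic divide and conquer, parametric in its four ingredients."""
    def run(problem):
        if is_trivial(problem):
            return solve(problem)
        return combine(problem, [run(part) for part in divide(problem)])
    return run


def merge(xs, ys):
    """Merge two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(xs) and j < len(ys):
        if xs[i] <= ys[j]:
            out.append(xs[i])
            i += 1
        else:
            out.append(ys[j])
            j += 1
    return out + xs[i:] + ys[j:]


# Merge sort as one instance of the combinator.
merge_sort = divide_and_conquer(
    is_trivial=lambda xs: len(xs) <= 1,
    solve=lambda xs: list(xs),
    divide=lambda xs: [xs[: len(xs) // 2], xs[len(xs) // 2 :]],
    combine=lambda _problem, parts: merge(parts[0], parts[1]),
)
```

A polynomial bound for this instance depends on the fact that the two lists returned by `divide` together are no larger than the input, a relation between the sizes of several results that is not visible from any single rule. This is the kind of non-local size information referred to above.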

7 Related Work 

What this paper shows is that complexity analysis of higher-order functional programs can be 
made easier by way of program transformations. As such, it can be seen as a complement rather 
than an alternative to existing methodologies. Since the literature on related work is quite vast, 
we will only give in this section an overview of the state of the art, highlighting the differences
to our work.

⁶We also ran experiments with AProVE and CaT as back-end; this, however, did not extend the power.

⁷Examples and full experimental evidence can be found on the HOCA homepage.

⁸Average PassMark CPU Mark 2851; http://www.cpubenchmark.net/ .
Control Flow Analysis. A clear understanding of control flow in higher-order programs is
crucial in almost any analysis of non-functional properties. Consequently, the body of literature
on control flow analysis is considerable, see e.g. the recent survey of Midtgaard [38]. Closest to
our work, control flow analysis has been successfully employed in termination analysis; for brevity
we mention only [24, 33, 42]. Jones and Bohr [33] study a strict, higher-order language,
where control flow analysis facilitates the construction of size-change graphs needed in the analysis.
Based on earlier work by Panitz and Schmidt-Schauß [42], Giesl et al. [24] study termination
of Haskell through so-called termination or symbolic execution graphs, which under the hood
corresponds to a careful study of the control flow in Haskell programs. While arguably weak
dependency pairs [29] or dependency triples [40] form a weak notion of control flow analysis, our
addition of collecting semantics to complexity analysis is novel.

Type Systems. That the role of type systems can go beyond type safety is well-known. The
abstraction type systems implicitly provide can enforce properties like termination or bounded
complexity. In particular, type systems for the λ-calculus are known which characterise relatively
small classes of functions like the one of polynomial time computable functions [12]. The principles
underlying these type systems, which by themselves cannot be taken as verification methodologies,
have been leveraged while defining type systems for more concrete programming languages and
type inference procedures, some of them being intensionally complete [17, 19]. All these results
are of course very similar in spirit to what we propose in this work. What is lacking in most
of the proposed approaches is the presence, at the same time, of higher-order, automation, and
a reasonable expressive power. As an example, even if in principle type systems coming from
light logics [12] indeed handle higher-order functions and are easily implementable, the class of
captured programs is small and full recursion is simply absent. On the other hand, Jost et al. [34]
have successfully encapsulated Tarjan’s amortised cost analysis into a type system that allows
a fully automatic resource analysis. In contrast to our work, only linear resource usage can be
established. However, their cost metric is general, while our technique only works for time bounds.
Also in the context of amortised analysis, Danielsson [20] provides a semiformal verification of
the runtime complexity of lazy functional languages, which allows the derivation of non-linear
complexity bounds on selected examples.

Term Rewriting. Traditionally, a major concern in rewriting has been the design of sound
algorithmic methodologies for checking termination. This has given rise to many different techniques
including basic techniques like path orders or interpretations, as well as sophisticated
transformation techniques, cf. [50, Chapter 6]. Complexity analysis of TRSs can be seen as a natural
generalisation of termination analysis. And, indeed, variations on path orders and the
interpretation methods capable of guaranteeing quantitative properties have appeared one after the other
starting from the beginning of the nineties [7, 15, 37]. In both termination and complexity
analysis, the rewriting community has always put a strong emphasis on automation. However, with
respect to higher-order rewrite systems (HRSs) only termination has received steady attention,
cf. [50, Chapter 11]. Except for very few attempts without any formal results, complexity analysis
of HRSs has been lacking [11, 16].

Cost Functions. An alternative strategy for complexity analysis consists in translating
programs into other expressions (which could be programs themselves) whose purpose is precisely
computing the complexity (also called the cost) of the original program. Complexity analysis is
this way reduced to purely extensional reasoning on the obtained expressions. Many works have
investigated this approach in the context of higher-order functional languages, starting from the
pioneering work by Sands [47] down to more recent contributions, e.g. [51]. What is common
among most of the cited works is that either automation is not considered (e.g. cost functions
can indeed be produced, but the problem of putting them in closed form is not [51]), or the time
complexity is not analysed parametrically on the size of the input [26]. A notable exception is
Benzinger’s work [13], which however only applies to programs extracted from proofs, and thus
only works with primitive recursive definitions.